Modified TGF-beta superfamily proteins

ABSTRACT

The invention provides modified TGF-β family proteins having altered biological or biochemical properties, and methods for making them. Specific modified protein constructs include TGF-β family member proteins that have N-terminal truncations, “latent” proteins, fusion proteins and heterodimers.

CONTINUING APPLICATION DATA

The instant utility application claims priority to U.S. provisional patent application No. 60/103,418, filed on Oct. 7, 1998, the entire contents of which are incorporated herein by reference; and the instant application is related to co-pending utility applications U.S. Ser. Nos. ______ and ______ (Attorney Docket Nos. STK-076 and STK-077) filed on even date herewith and also based on the aforementioned provisional application, the disclosures of which are incorporated herein by reference.

FIELD OF THE INVENTION

The invention relates to recombinant proteins having improved refolding properties, improved physical properties (such as solubility and stability), improved biological activity, including altered receptor binding, improved targeting capabilities, latent forms of proteins, and methods for producing such proteins. More particularly, the invention relates to biosynthetic members of the TGF-β superfamily of structurally-related proteins. Such modified protein constructs include TGF-β family member proteins that have N-terminal truncations, “latent” proteins, fusion proteins and heterodimers.

BACKGROUND OF THE INVENTION

The TGF-β superfamily includes five distinct forms of TGF-β (Sporn and Roberts (1990) in Peptide Growth Factors and Their Receptors, Sporn and Roberts, eds., Springer-Verlag: Berlin pp. 419-472), as well as the differentiation factors vg-1 (Weeks and Melton (1987) Cell 51: 861-867), DPP-C polypeptide (Padgett et al. (1987) Nature 325: 81-84), the hormones activin and inhibin (Mason et al. (1985) Nature 318: 659-663; Mason et al. (1987) Growth Factors 1: 77-88), the Mullerian-inhibiting substance, MIS (Cate et al. (1986) Cell 45: 685-698), osteogenic and morphogenic proteins OP-1 (PCT/US90/05903), OP-2 (PCT/US91/07654), OP-3 (PCT/WO94/10202), the BMPs (see U.S. Pat. Nos. 4,877,864; 5,141,905; 5,013,649; 5,116,738; 5,108,922; 5,106,748; and 5,155,058), the developmentally regulated protein VGR-1 (Lyons et al. (1989) Proc. Natl. Acad. Sci. USA 86: 4554-4558), cartilage-derived growth factors CDMP-1, CDMP-2 and CDMP-3 (or GDF-5, GDF-6 and GDF-7), and the growth/differentiation factors GDF-1, GDF-3, GDF-9 and dorsalin-1 (McPherron et al. (1993) J. Biol. Chem. 268: 3444-3449; Basler et al. (1993) Cell 73: 687-702).

The proteins of the TGF-β superfamily are disulfide-linked homo- or heterodimers that are expressed as large precursor polypeptide chains containing a hydrophobic signal sequence, a long and relatively poorly conserved N-terminal pro region sequence of several hundred amino acids, a cleavage site, and a mature domain comprising an N-terminal region that varies among the family members and a more highly conserved C-terminal region. This C-terminal region, present in the processed mature proteins of all known family members, contains approximately 100 amino acids with a characteristic cysteine motif having a conserved six or seven cysteine skeleton. Although the position of the cleavage site between the mature and pro regions varies among the family members, the cysteine pattern of the C-terminus of all of the proteins is in the identical format, ending in the sequence Cys-X-Cys-X (Sporn and Roberts (1990), supra).

Recombinant TGF-β1 has been cloned (Derynck et al. (1985) Nature 316: 701-705), and expressed in Chinese hamster ovary cells (Gentry et al. (1987) Mol. Cell. Biol. 7: 3418-3427). Additionally, recombinant human TGF-β2 (deMartin et al. (1987) EMBO J. 6: 3673), as well as human and porcine TGF-β3 (Derynck et al. (1988) EMBO J. 7: 3737-3743; Dijke et al. (1988) Proc. Natl. Acad. Sci. USA 85: 4715), have been cloned. Expression levels of the mature TGF-β1 protein in COS cells have been increased by substituting cysteine residues located in the pro region of the TGF-β1 precursor with serine residues (Brunner et al. (1989) J. Biol. Chem. 264: 13660-13664).

A unifying feature of the biology of the proteins of the TGF-β superfamily is their ability to regulate developmental processes. These structurally related proteins have been identified as being involved in a variety of developmental events. For example, TGF-β and the polypeptides of the inhibin/activin group appear to play a role in the regulation of cell growth and differentiation. MIS causes regression of the Mullerian duct in development of the mammalian male embryo, and dpp, the gene product of the Drosophila decapentaplegic complex, is required for appropriate dorsal-ventral specification. Similarly, Vg-1 is involved in mesoderm induction in Xenopus, and Vgr-1 has been identified in a variety of developing murine tissues. Regarding bone formation, many of the proteins in the TGF-β supergene family, namely OP-1 and a subset of the BMPs, apparently play the major role. OP-1 (BMP-7) and other osteogenic proteins have been produced using recombinant techniques (U.S. Pat. No. 5,011,691 and PCT Application No. US 90/05903) and shown to be able to induce formation of true endochondral bone in vivo. BMP-2 has been recombinantly produced in monkey COS-1 cells and Chinese hamster ovary cells (Wang et al. (1990) Proc. Natl. Acad. Sci. USA 87: 2220-2224).

Recently the family of proteins taught as having osteogenic activity as judged by the Sampath and Reddi bone formation assay has been shown to be morphogenic, i.e., capable of inducing the developmental cascade of tissue morphogenesis in a mature mammal (See PCT Application No. US 92/01968). In particular, these proteins are capable of inducing the proliferation of uncommitted progenitor cells, and inducing the differentiation of these stimulated progenitor cells in a tissue-specific manner under appropriate environmental conditions. In addition, the morphogens are capable of supporting the growth and maintenance of these differentiated cells. These morphogenic activities allow the proteins to initiate and maintain the developmental cascade of tissue morphogenesis in an appropriate, morphogenically permissive environment, stimulating stem cells to proliferate and differentiate in a tissue-specific manner, and inducing the progression of events that culminate in new tissue formation. These morphogenic activities also allow the proteins to induce the “redifferentiation” of cells previously stimulated to stray from their differentiation path. Under appropriate environmental conditions it is anticipated that these morphogens also may stimulate the “redifferentiation” of committed cells.

The osteogenic proteins generally are classified in the art as a subgroup of the TGF-β superfamily of growth factors (Hogan (1996), Genes & Development, 10: 1580-1594). Variously termed “osteogenic proteins”, “morphogenic proteins”, “morphogens”, “bone morphogenic proteins” or “BMPs”, they are identified by their ability to induce ectopic, endochondral bone morphogenesis. Members of the morphogen family of proteins include the mammalian osteogenic protein-1 (OP-1, also known as BMP-7, and the Drosophila homolog 60A), osteogenic protein-2 (OP-2, also known as BMP-8), osteogenic protein-3 (OP-3), BMP-2 (also known as BMP-2A or CBMP-2A, and the Drosophila homolog DPP), BMP-3, BMP-4 (also known as BMP-2B or CBMP-2B), BMP-5, BMP-6 and its murine homolog Vgr-1, BMP-9, BMP-10, BMP-11, BMP-12, GDF3 (also known as Vgr2), GDF-8, GDF-9, GDF-10, GDF-11, GDF-12, BMP-13, BMP-14, BMP-15, GDF-5 (also known as CDMP-1 or MP52), GDF-6 (also known as CDMP-2 or BMP-13), GDF-7 (also known as CDMP-3 or BMP-12), the Xenopus homolog Vg1 and NODAL, UNIVIN, SCREW, ADMP, and NEURAL.

Whether naturally-occurring or synthetically prepared, osteogenic proteins can induce recruitment and/or stimulation of progenitor cells, thereby inducing their differentiation into chondrocytes and osteoblasts, and further inducing differentiation of intermediate cartilage, vascularization, bone formation, remodeling, and, finally, marrow differentiation. Furthermore, numerous practitioners have demonstrated the ability of these osteogenic proteins, when admixed with either naturally-sourced matrix materials such as collagen or synthetically-prepared polymeric matrix materials, to induce bone formation, including membranous and endochondral bone formation, under conditions where true replacement bone would not otherwise occur. For example, when combined with a matrix material, these osteogenic proteins induce formation of new bone in large segmental bone defects, spinal fusions, calvarial defects, and fractures.

Bacterial and other prokaryotic expression systems are relied on in the art as preferred means for generating recombinant proteins. Prokaryotic systems such as E. coli are useful for producing commercial quantities of proteins, as well as for evaluating biological properties of naturally occurring or biosynthetic mutants and analogs. Typically, an over-expressed eukaryotic protein aggregates as an insoluble intracellular precipitate (“inclusion body”) in the prokaryote host cell. The aggregated protein is then collected from the inclusion bodies, solubilized using one or more standard denaturing agents, and then allowed, or induced, to refold into a functional state. Proper refolding to form a biologically active protein structure requires proper formation of any disulfide bonds.

Chemical synthesis may also be employed to produce protein constructs. Technology is widely available to permit routine, automated assembly of peptide chains. Techniques are known in the art which utilize enzymatic and chemical methods for coupling peptide fragments into synthetic protein molecules. See, e.g., Hilvert, Chem. Biol. (1994) 1(4): 201-03; Muir et al., Proc. Nat'l Acad. Sci. USA (1998) 95(12): 6705-10; Wallace, Curr. Opin. Biotechnol. (1995) 6(4): 403-10; Miranda et al., Proc. Nat'l Acad. Sci. USA (1999) 96(4): 1181-6; and Liu et al., Proc. Nat'l Acad. Sci. USA (1994) 91(14): 6584-8.

For example, the tertiary and quaternary structures of both TGF-β2 and OP-1 have been determined. Although TGF-β2 and OP-1 exhibit only about 35% amino acid identity in their respective amino acid sequences, the tertiary and quaternary structures of both molecules are strikingly similar. Both TGF-β2 and OP-1 are dimeric in nature and have a unique folding pattern involving six of the seven C-terminal cysteine residues, as illustrated in FIG. 1A. FIG. 1A shows that in each subunit four cysteines bond to generate an eight residue ring, and two additional cysteine residues form a disulfide bond that passes through the ring to form a knot-like structure. With a numbering scheme beginning with the most N-terminal cysteine of the 7 conserved cysteine residues assigned number 1, the 2nd and 6th conserved cysteine residues bond to close one side of the eight residue ring while the 3rd and 7th cysteine residues close the other side. The 1st and 5th conserved cysteine residues bond through the center of the ring to form the core of the knot. The 4th conserved cysteine forms an interchain disulfide bond with the corresponding residue in the other subunit.
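
The disulfide connectivity just described can be restated as a small lookup table. The following Python sketch is illustrative only; the names `INTRACHAIN_BONDS`, `INTERCHAIN_CYSTEINE` and `bonding_partner` are hypothetical and do not appear in the specification, and the cysteine numbers follow the 1-7 scheme given in the text.

```python
# Conserved cysteines are numbered 1-7, starting from the most N-terminal one.
# Hypothetical names; the pairings themselves are taken from the description.
INTRACHAIN_BONDS = {
    "ring_side_a": (2, 6),  # closes one side of the eight residue ring
    "ring_side_b": (3, 7),  # closes the other side of the ring
    "knot_core": (1, 5),    # passes through the ring to form the knot
}
INTERCHAIN_CYSTEINE = 4     # bonds to cysteine 4 of the other subunit


def bonding_partner(cys: int) -> tuple[int, str]:
    """Return (partner cysteine number, bond kind) for a conserved cysteine."""
    if cys == INTERCHAIN_CYSTEINE:
        return (INTERCHAIN_CYSTEINE, "interchain")
    for first, second in INTRACHAIN_BONDS.values():
        if cys == first:
            return (second, "intrachain")
        if cys == second:
            return (first, "intrachain")
    raise ValueError(f"not a conserved cysteine number: {cys}")


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, bonding_partner(n))
```

Six of the seven cysteines fall in the intrachain table, consistent with the statement that the folding pattern within each subunit involves six of the seven conserved residues.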

The TGF-β2 and OP-1 monomer subunits comprise three major structural elements and an N-terminal region. The structural elements are made up of regions of contiguous polypeptide chain that possess over 50% secondary structure of the following types: (1) loop, (2) α-helix and (3) β-sheet. Furthermore, in these regions the N-terminal and C-terminal strands are not more than 7 Å apart. The residues between the 1st and 2nd conserved cysteines (FIG. 1A) form a structural region characterized by an anti-parallel β-sheet finger, referred to herein as the finger 1 region (F1). A ribbon trace of the finger 1 peptide backbone is shown in FIG. 1B. Similarly the residues between the 5th and 6th conserved cysteines in FIG. 1A also form an anti-parallel β-sheet finger, referred to herein as the finger 2 region (F2). A ribbon trace of the finger 2 peptide backbone is shown in FIG. 1D. A β-sheet finger is a single amino acid chain, comprising a β-strand that folds back on itself by means of a β-turn or some larger loop so that the entering and exiting strands form one or more anti-parallel β-sheet structures. The third major structural region, involving the residues between the 3rd and 4th conserved cysteines in FIG. 1A, is characterized by a three turn α-helix referred to herein as the heel region (H). A ribbon trace of the heel peptide backbone is shown in FIG. 1C.

The organization of the monomer structure is similar to that of a left hand where the knot region is located at the position equivalent to the palm, finger 1 is equivalent to the index and middle fingers, the α-helix is equivalent to the heel of the hand, and finger 2 is equivalent to the ring and small fingers. The N-terminal region (not well defined in the published structures) is predicted to be located at a position roughly equivalent to the thumb.

In the dimeric forms of both TGF-β2 and OP-1, the subunits are oriented such that the heel region of one subunit contacts the finger regions of the other subunit with the knot regions of the connected subunits forming the core of the molecule. The 4th cysteine forms a disulfide bridge with its counterpart on the second chain thereby equivalently linking the chains at the center of the palms. The dimer thus formed is an ellipsoidal (cigar shaped) molecule when viewed from the top looking down the two-fold axis of symmetry between the subunits (FIG. 2A). Viewed from the side, the molecule resembles a bent “cigar” since the two subunits are oriented at a slight angle relative to each other (FIG. 2B).

However, not all solubilized heterologous proteins readily refold. Despite careful manipulation of refolding, the yields of properly folded, biologically active protein remain low. Many TGF-β family members, including BMPs, fall into the category of poor refolder proteins. While some members of the TGF-β protein family can be folded efficiently in vitro as, for example, when produced in E. coli or other prokaryotic hosts, many others, including BMP5, BMP6, and BMP7, cannot. See, e.g., EP 0433225, U.S. Pat. No. 5,399,677, U.S. Pat. No. 5,756,308 and U.S. Pat. No. 5,804,416.

A need remains for improved means for producing in vitro recombinant BMPs and other TGF-β family proteins using prokaryotic as well as eukaryotic host cells.

SUMMARY OF THE INVENTION

The present invention provides modified TGF-β family proteins which comprise N-terminal extensions, truncations and other modifications at the N-terminal end of C-terminal active domains. Modified proteins of the invention have altered refolding properties and altered solubility with respect to naturally occurring proteins when expressed recombinantly. Modified proteins of the invention also have altered activity profiles, including enhanced specific activity, and are amenable to tissue-specific targeting or specific surface binding.

As a result of these discoveries, means are available for predicting and designing de novo BMPs and other TGF-β family member analogs having altered biological properties, including improved folding capabilities in vitro, improved solubility, altered stability, altered isoelectric points, and/or altered biological activities, as desired. These discoveries also lend themselves to creating proteins whose activity can be directed towards specific sites within a mammal and/or whose activity can be regulated, inhibited and/or induced. The invention also provides means for easily and quickly evaluating biological and/or biochemical properties of candidate constructs, including mapping epitopes of folded proteins.

The invention provides “mutant” forms of proteins that improve the refolding properties of “poor refolder” TGF-β family members. As used herein, a “poor refolder” protein means any protein that, when induced to refold under suitable refolding conditions, yields less than about 1% properly refolded material, as measured using a standard protocol (see below). As contemplated herein, “suitable refolding conditions” are conditions under which proteins can be refolded to the extent required to confer functionality. One skilled in the art will recognize that at least Section IC and Example 3 of the “Detailed Description of the Preferred Embodiment” are non-limiting examples of such refolding conditions. Structural parameters relevant to the compositions and methods of the instant invention include one or more disulfide bridges properly distributed throughout the dimeric protein's structure and which require a reduction-oxidation (“redox”) reaction step to yield a folded structure. Redox reactions typically occur at neutral pH, i.e., in the range of about pH 7.0-8.5, typically in the range of about pH 7.5-8.5, and preferably under physiologically-compatible conditions. The skilled artisan will appreciate and recognize optimal conditions for success.

The proteins preferably are manufactured in accordance with the principles disclosed herein by assembly of nucleotides and/or joining DNA restriction fragments to produce synthetic DNAs. The DNAs are transfected into an appropriate protein expression vehicle, the encoded protein expressed, folded if necessary, and purified. Particular constructs can be tested for activity in vitro. The tertiary structure of the candidate protein constructs may be iteratively refined and binding modulated by site-directed or nucleotide sequence directed mutagenesis aided by the principles disclosed herein, computer-based protein structure modeling, and recently developed rational drug design techniques to improve or modulate specific properties of a molecule of interest. Known phage display or other nucleotide expression systems may be exploited to produce simultaneously a large number of candidate constructs. The pool of candidate constructs subsequently may be screened for binding specificity using, for example, a chromatography column comprising surface immobilized receptors, salt gradient elution to select for, and to concentrate, high binding candidates, and in vitro assays. Identification of a useful recombinant protein is followed by production of cell lines expressing commercially useful quantities of the protein for laboratory use and ultimately for producing therapeutically useful drugs. It has now been discovered how to design, make, test and use chimeric proteins comprising an amino acid sequence which, when properly folded, assumes a tertiary structure defining a finger 1 region, a finger 2 region, and a heel region.

All of the constructs of the invention comprise regions of amino acid sequences defining the regions required for utility, namely, finger 1, finger 2, and heel regions, and an additional region that can modify activity, namely the N-terminal peptide sequence. Sequences for the finger and heel regions may be copied from the respective finger and heel region sequences of any known TGF-β superfamily member identified herein. Alternatively, the finger and heel regions may be selected from the amino acid sequence of a new member of this superfamily discovered hereafter using the principles disclosed hereinbelow.

The finger and heel sequences also may be altered by amino acid substitution, for example by exploiting substitute amino acid residues selected in accordance with the principles disclosed in Smith et al. (1990) Proc. Natl. Acad. Sci. USA 87: 118-122, the disclosure of which is incorporated herein by reference. Smith et al. disclose an amino acid class hierarchy, similar to the amino acid hierarchy table set forth in FIG. 3, which may be used to rationally substitute one amino acid for another while minimizing gross conformational distortions of the type which otherwise may inactivate the protein. In any event, it is contemplated that many synthetic finger 1, finger 2, and heel region sequences, having only 70% homology with natural regions, preferably 80%, and most preferably at least 90%, may be used to produce active morphon constructs. It is contemplated also, as disclosed herein, that the size of the constructs may be reduced significantly by truncating the natural finger and heel regions of the template TGF-β superfamily member.

As used herein, “acidic” or “negatively charged residues” are understood to include any amino acid residue, naturally-occurring or synthetic, that typically carries a negative charge on its R group under physiological conditions. Examples include, without limitation, aspartic acid (“Asp”) and glutamic acid (“Glu”). Similarly, basic or positively charged residues include any amino acid residue, naturally-occurring or synthetically created, that typically carries a positive charge on its R group under physiological conditions. Examples include, without limitation, arginine (“Arg”), lysine (“Lys”) and histidine (“His”). As used herein, “hydrophilic” residues include both acidic and basic amino acid residues, as well as uncharged residues carrying amide groups on their R groups, including, without limitation, glutamine (“Gln”) and asparagine (“Asn”), and polar residues carrying hydroxyl groups on their R groups, including, without limitation, serine (“Ser”), tyrosine (“Tyr”) and threonine (“Thr”). A skilled artisan will appreciate that the actual physiological pK will vary, and that the charge will vary in different physiological environments.
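
As an informal aid, the residue classes defined above can be written out as sets. This Python sketch lists only the example residues named in the text, which are given "without limitation", so the sets are not exhaustive; the names `ACIDIC`, `BASIC`, `HYDROPHILIC` and `classify` are hypothetical.

```python
# Example residues named in the definitions (three-letter codes).
# The definitions are non-limiting, so these sets are illustrative only.
ACIDIC = {"Asp", "Glu"}
BASIC = {"Arg", "Lys", "His"}
AMIDE = {"Gln", "Asn"}            # uncharged, amide-bearing R groups
HYDROXYL = {"Ser", "Tyr", "Thr"}  # polar, hydroxyl-bearing R groups

# "Hydrophilic" covers acidic and basic residues plus the two polar groups.
HYDROPHILIC = ACIDIC | BASIC | AMIDE | HYDROXYL


def classify(residue: str) -> list[str]:
    """Return the class labels that apply to a three-letter residue code."""
    labels = []
    if residue in ACIDIC:
        labels.append("acidic")
    if residue in BASIC:
        labels.append("basic")
    if residue in HYDROPHILIC:
        labels.append("hydrophilic")
    return labels


if __name__ == "__main__":
    for res in ("Asp", "His", "Ser", "Leu"):
        print(res, classify(res))
```

The sketch assigns a fixed class to each residue and so does not capture the caveat that the actual charge varies with the physiological environment.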

As used herein, “biosynthesis” or “biosynthetic” means occurring as a result of, or originating from, a ligation of naturally- or synthetically-derived fragments. For example, but not limited to, ligating peptide or nucleic acid fragments corresponding to one or more subdomains (or fragments thereof) disclosed herein. “Chemosynthesis” or “chemosynthetic” means occurring as a result of, or originating from, a chemical means of production. For example, but not limited to, synthesis of a peptide or nucleic acid sequence using a standard automated synthesizer/sequencer from a commercially-available source. It is contemplated that both natural and non-natural amino acids can be used to obtain the desired attributes, as taught herein. “Recombinant” production or technology means occurring as a result of, or originating from, a genetically engineered means of production. For example, but not limited to, expression of a genetically-engineered DNA sequence or gene encoding a chimeric protein (or fragment thereof) of the present invention. Also included within the meaning of the foregoing are the teachings set forth below in at least Sections I.B.; Section II; and at least Examples 1 and 2. “Synthetic” means occurring or originating non-naturally, i.e., not naturally occurring.

As used herein, “corresponding residue position” refers to a residue position in a protein sequence that corresponds to a given position in an OP-1 or other reference TGF-β family member amino acid sequence, when the two sequences are aligned. As will be appreciated by those skilled in the art and as illustrated in FIG. 1, the sequences of BMP family members are highly conserved in the C-terminal active domain, and particularly in the finger 2 sub-domain. Amino acid sequence alignment methods and programs are well developed in the art. See, e.g., the method of Needleman, et al. (1970) J. Mol. Biol. 48: 443-453, implemented conveniently by computer programs such as the Align program (DNAstar, Inc.). Internal gaps and amino acid insertions in the second sequence are ignored for purposes of calculating the alignment. For ease of description, hOP-1 (human OP-1, also referred to in the art as “BMP-7”) is provided below as a representative osteogenic protein. It will be appreciated, however, that OP-1 is merely representative of the TGF-β family of proteins.

As used herein, “TGF-β family member” or “TGF-β family protein,” means a protein known to those of ordinary skill in the art as a member of the TGF-β superfamily. Structurally, such proteins are disulfide-linked homo- or heterodimers that are expressed as large precursor polypeptide chains containing a hydrophobic signal sequence, an N-terminal pro region of several hundred amino acids, and a mature domain comprising a variable N-terminal region and a more highly conserved C-terminal region containing approximately 100 amino acids with a characteristic cysteine motif having a conserved six or seven cysteine skeleton. These structurally-related proteins have been identified as being involved in a variety of developmental events. TGF-β family members are typified by TGF-β1 and OP-1. Other TGF-β family proteins useful in the practice of the present invention include osteogenic proteins (as defined below), vg-1, DPP-C polypeptide, the hormones activin and inhibin, MIS, VGR-1 and growth/differentiation factors GDF-1, GDF-3, GDF-9 and dorsalin-1.

It has been found that various members of the TGF-β protein superfamily mediate their activity by interaction with two different cell surface receptors, referred to as Type I and Type II receptors, to form a hetero-complex. The Type I and Type II receptors are both serine/threonine kinases and share similar structures: an intracellular domain that consists essentially of the kinase, a short, extended hydrophobic sequence sufficient to span the membrane one time, and an extracellular ligand-binding domain characterized by a high concentration of conserved cysteines. The various Type I and Type II receptors have specific binding affinity with OP-1 and other morphogenic proteins, and their analogs, including the modified morphogens of the present invention.

“Osteogenic protein”, or “bone morphogenic protein,” means a TGF-β superfamily protein which can induce the full cascade of morphogenic events culminating in skeletal tissue formation, including but not limited to cartilage and/or endochondral bone formation. Osteogenic proteins useful herein include any known naturally-occurring native proteins including allelic, phylogenetic counterpart and other variants thereof, whether naturally-occurring or biosynthetically produced (e.g., including “muteins” or “mutant proteins”), as well as new, osteogenically active members of the general morphogenic family of proteins. As described herein, this class of proteins is generally typified by human osteogenic protein 1 (hOP-1). Other osteogenic proteins useful in the practice of the invention include osteogenically active forms of proteins included within the list of: OP-1, OP-2, OP-3, BMP-2, BMP-3, BMP-4, BMP-5, BMP-6, BMP-9, DPP, Vg-1, Vgr, 60A protein, CDMP-1, CDMP-2, CDMP-3, GDF-1, GDF-3, GDF-5, 6, 7, MP-52, BMP-10, BMP-11, BMP-12, BMP-13, BMP-15, UNIVIN, NODAL, SCREW, ADMP or NEURAL, including amino acid sequence variants thereof, and/or heterodimers thereof. In one currently preferred embodiment, osteogenic protein useful in the practice of the invention includes any one of: OP-1, BMP-2, BMP-4, BMP-12, BMP-13, GDF-5, GDF-6, GDF-7, CDMP-1, CDMP-2, CDMP-3, MP-52 and amino acid sequence variants and homologs thereof, including species homologs thereof. In still another preferred embodiment, useful osteogenically active proteins have polypeptide chains with amino acid sequences comprising a sequence encoded by a nucleic acid that hybridizes, under low, medium or high stringency hybridization conditions, to DNA or RNA encoding reference osteogenic sequences, e.g., C-terminal sequences defining the conserved seven cysteine domains of OP-1, OP-2, BMP-2, BMP-4, BMP-5, BMP-6, 60A, GDF-5, GDF-6, GDF-7 and the like.

As used herein, high stringency hybridization conditions are defined as hybridization according to known techniques in 40% formamide, 5×SSPE, 5× Denhardt's Solution, and 0.1% SDS at 37° C. overnight, and washing in 0.1×SSPE, 0.1% SDS at 50° C. Standard stringency conditions are well characterized in commercially available, standard molecular cloning texts. See, for example, Molecular Cloning A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); and B. Perbal, A Practical Guide To Molecular Cloning (1984); the disclosures of the foregoing are incorporated by reference herein. See also, U.S. Pat. Nos. 5,750,651 and 5,863,758, the disclosures of which are incorporated by reference herein.

Other members of the TGF-β superfamily of related proteins having utility in the practice of the instant invention include native poor refolder proteins among the list: TGF-β1, TGF-β2, TGF-β3, TGF-β4 and TGF-β5, various inhibins, activins, BMP-11, and MIS, to name a few. FIG. 4 lists the C-terminal 35 residues defining the finger 2 subdomain of various known members of the TGF-β superfamily. Any one of the proteins on the list that is a poor refolder can be improved by the methods of the invention, as can other known or discoverable family members. As further described herein, the biologically active osteogenic proteins suitable for use with the present invention can be identified by means of routine experimentation using the art-recognized bioassay described by Reddi and Sampath. A detailed description of useful proteins follows. Equivalents can be identified by the artisan using no more than routine experimentation and ordinary skill.

“Morphogens” or “morphogenic proteins” as contemplated herein include members of the TGF-β superfamily which have been recognized to be morphogenic, i.e., capable of inducing the developmental cascade of tissue morphogenesis in a mature mammal (See PCT Application No. US 92/01968). In particular, these morphogens are capable of inducing the proliferation of uncommitted progenitor cells, and inducing the differentiation of these stimulated progenitor cells in a tissue-specific manner under appropriate environmental conditions. In addition, the morphogens are capable of supporting the growth and maintenance of these differentiated cells. These morphogenic activities allow the proteins to initiate and maintain the developmental cascade of tissue morphogenesis in an appropriate, morphogenically permissive environment, stimulating stem cells to proliferate and differentiate in a tissue-specific manner, and inducing the progression of events that culminate in new tissue formation. These morphogenic activities also allow the proteins to induce the “redifferentiation” of cells previously stimulated to stray from their differentiation path. Under appropriate environmental conditions it is anticipated that these morphogens also may stimulate the “redifferentiation” of committed cells. To guide the skilled artisan, described herein are numerous means for testing morphogenic proteins in a variety of tissues and for a variety of attributes typical of morphogenic proteins. It will be understood that these teachings can be used to assess morphogenic attributes of native proteins as well as modified proteins of the present invention.

Useful native or parent proteins of the present invention also include those sharing at least 70% amino acid sequence homology within the C-terminal seven-cysteine domain of human OP-1. To determine the percent homology of a candidate amino acid sequence to the conserved seven-cysteine domain, the candidate sequence and the seven-cysteine domain are aligned. The first step for performing an alignment is to use an alignment tool, such as the dynamic programming algorithm described in Needleman et al., J. MOL. BIOL. 48: 443 (1970), the teachings of which are incorporated by reference herein, and the Align Program, a commercial software package produced by DNAstar, Inc. After the initial alignment is made, it is then refined by comparison to a multiple sequence alignment of a family of related proteins. Once the alignment between the candidate sequence and the seven-cysteine domain is made and refined, a percent homology score is calculated. The individual amino acids of each sequence are compared sequentially according to their similarity to each other. Similarity factors include similar size, shape and electrical charge. One particularly preferred method of determining amino acid similarities is the PAM250 matrix described in Dayhoff et al., 5 ATLAS OF PROTEIN SEQUENCE AND STRUCTURE 345-352 (1978 & Supp.), incorporated by reference herein. A similarity score is first calculated as the sum of the aligned pairwise amino acid similarity scores. Insertions and deletions are ignored for the purposes of percent homology and identity. Accordingly, gap penalties are not used in this calculation. The raw score is then normalized by dividing it by the geometric mean of the scores of the candidate compound and the seven-cysteine domain. The geometric mean is the square root of the product of these scores. The normalized raw score is the percent homology.
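
The normalization step described above may be easier to follow as a short calculation. The Python sketch below is one possible reading, not the specification's protocol, and rests on several assumptions: the two sequences are already aligned to equal length with "-" marking gaps; the "score" of each sequence is taken to be its score against itself; the result is multiplied by 100 to express a percentage; and `toy_score` is a hypothetical stand-in for a real PAM250 lookup. All function names are hypothetical.

```python
import math


def toy_score(a: str, b: str) -> int:
    """Hypothetical stand-in for a PAM250 lookup: 2 for identity, else 0."""
    return 2 if a == b else 0


def percent_homology(candidate: str, reference: str,
                     score=toy_score, gap: str = "-") -> float:
    """Normalized similarity of two pre-aligned sequences, as a percentage."""
    if len(candidate) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    # Insertions and deletions are ignored: skip any column containing a gap.
    raw = sum(score(a, b) for a, b in zip(candidate, reference)
              if a != gap and b != gap)
    # Assumption: each sequence's own score is its score against itself.
    self_candidate = sum(score(a, a) for a in candidate if a != gap)
    self_reference = sum(score(b, b) for b in reference if b != gap)
    # Normalize by the geometric mean (square root of the product).
    return 100.0 * raw / math.sqrt(self_candidate * self_reference)


if __name__ == "__main__":
    print(percent_homology("ACDE", "ACDE"))
    print(percent_homology("ACDE", "ACDF"))
```

Passing a function that looks up PAM250 values in place of `toy_score` would bring the sketch closer to the preferred method; the toy scoring is used here only to keep the example self-contained.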

As used herein, “conservative substitutions” are residues that are physically or functionally similar to the corresponding reference residues, e.g., that have similar size, shape, electric charge, chemical properties including the ability to form covalent or hydrogen bonds, or the like. Particularly preferred conservative substitutions are those fulfilling the criteria defined for an accepted point mutation in Dayhoff et al. Ibid. Examples of conservative substitutions include the substitution of one amino acid for another with similar characteristics, e.g., substitutions within the following groups are well-known: (a) glycine, alanine; (b) valine, isoleucine, leucine; (c) aspartic acid, glutamic acid; (d) asparagine, glutamine; (e) serine, threonine; (f) lysine, arginine, histidine; and (g) phenylalanine, tyrosine. The term “conservative variant” or “conservative variation” also includes the use of a substituted amino acid in place of an unsubstituted parent amino acid in a given polypeptide chain, provided that antibodies having binding specificity for the resulting substituted polypeptide chain also have binding specificity for (i.e., “crossreact” or “immunoreact” with) the unsubstituted or parent polypeptide chain.

As used herein, a “conserved residue position” refers to a location in a reference amino acid sequence occupied by the same amino acid or a conservative variant thereof in at least one other member sequence. For example, in FIG. 4, comparing BMP-2, BMP-4, BMP-5, and BMP-6 with OP-1 as the reference sequence, positions 1, 5, 9, 12, 14, 15, 16, 17, 19, 22, etc. are conserved positions, and residues 2, 3, 4, 6, 7, 8, 10, 11, 13, 18, 20, 21, etc. are non-conserved positions.
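By way of illustration only, the definition of a conserved residue position can be sketched in Python as follows. The substitution groups are those listed above under “conservative substitutions”; the function names are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and any sequences passed in are supplied by the user (the sketch does not reproduce FIG. 4).

```python
# Conservative substitution groups (a) through (g), single-letter code.
CONSERVATIVE_GROUPS = ["GA", "VIL", "DE", "NQ", "ST", "KRH", "FY"]


def is_conservative(a, b):
    """True if b is the same residue as a, or lies in the same group."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def conserved_positions(reference, others, gap="-"):
    """Return 1-based reference positions that are conserved.

    A position counts as conserved when at least one other aligned
    sequence carries the same residue or a conservative variant there.
    """
    positions = []
    for i, ref_residue in enumerate(reference):
        if ref_residue == gap:
            continue
        if any(seq[i] != gap and is_conservative(ref_residue, seq[i])
               for seq in others):
            positions.append(i + 1)
    return positions
```

With the made-up reference "AVDKW" and comparison sequences "SLPQC" and "TMDRF", the function reports positions 2, 3 and 4 as conserved.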

As used herein, the “base” or “neck” region of the finger 2 sub-domain is defined by residues 1-10 and 22-35, as exemplified by OP-1, and counting from the first residue following the cysteine doublet in the C-terminal active domain (See FIG. 4). As is readily apparent from a sequence alignment of other TGF-β protein family members with OP-1, the corresponding base or neck region for a longer protein, such as BMP-9 or Dorsalin, is defined by residues 1-10 and 23-36; for a shorter protein, such as NODAL, the corresponding region is defined by residues 1-10 and 22-34 (See FIG. 4). In SEQ ID NO: 39 (human OP-1), the residues corresponding to the base or neck region of the finger 2 subdomain are residues 397-406 (corresponding to residues 1-10 in FIG. 4) and residues 418-431 (corresponding to residues 22-35 in FIG. 4).
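The correspondence between the FIG. 4 numbering and SEQ ID NO: 39 amounts to a constant offset of 396 for human OP-1 (397 - 1 = 418 - 22 = 396). A short Python sketch of that bookkeeping follows; the constant and function names are hypothetical, and the offset applies only to OP-1 as numbered in SEQ ID NO: 39.

```python
# Offset between FIG. 4 numbering and SEQ ID NO: 39 (human OP-1),
# derived from the stated correspondence 1-10 <-> 397-406.
OP1_OFFSET = 396

# Base/neck region of the finger 2 sub-domain, FIG. 4 numbering (OP-1).
NECK_RANGES_FIG4 = [(1, 10), (22, 35)]


def fig4_to_seq39(position):
    """Convert a FIG. 4 position to the SEQ ID NO: 39 residue number."""
    return position + OP1_OFFSET


def is_neck_residue(seq39_residue):
    """True if a SEQ ID NO: 39 residue falls in the base/neck region."""
    fig4_position = seq39_residue - OP1_OFFSET
    return any(lo <= fig4_position <= hi for lo, hi in NECK_RANGES_FIG4)


neck_ranges_seq39 = [(fig4_to_seq39(lo), fig4_to_seq39(hi))
                     for lo, hi in NECK_RANGES_FIG4]
```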

As used herein, “C-terminal active domain” refers to the conserved C-terminal region of mature TGF-β family proteins. The C-terminal active domain contains approximately 100 amino acids with a characteristic cysteine motif having a six or seven cysteine skeleton. The cysteine pattern of the C-terminus of all of the proteins is in the identical format ending in the sequence Cys-X-Cys-X (Sporn and Roberts (1990), supra).

As used herein, “amino acid sequence homology” includes both amino acid sequence identity and similarity. Homologous sequences share identical and/or similar amino acid residues, where similar residues are conservative substitutions for, or “allowed point mutations” of, corresponding amino acid residues in an aligned reference sequence.

As used herein, the terms “chimeric protein”, “chimera”, “chimeric polypeptide chain”, “chimeric construct” and “chimeric mutant” refer to any BMP or TGF-β family member synthetic construct wherein the amino acid sequence of at least one defined region, domain or sub-domain, such as the finger 1, finger 2 or heel sub-domain, has been replaced in whole or in part with an amino acid sequence from at least one other, different BMP or TGF-β family member protein, such that the resulting construct has an amino acid sequence recognizable as being derived from the different protein sources. Chimeric constructs also comprise recombinant fusion proteins in which the C-terminal active domain of one morphogen is fused to the N-terminal domain of another morphogen.

As used herein, a “leader sequence” is any sequence of amino acids corresponding to a sequence of nucleotides upstream, that is, positioned farther to the N-terminal end, of the C-terminal active domain region of a TGF-β family protein. Modifications in the leader sequence can alter refolding properties, activity levels and solubility, control activation, and promote tissue-targeting as well as affinity-binding ability.

As used herein, useful expression host cells include prokaryotes and eukaryotes, including any host cell capable of making an inclusion body. Particularly useful host cells include, without limitation, bacterial hosts such as E. coli, as well as B. subtilis and Pseudomonas. Other useful hosts include lower eukaryotes, such as Saccharomyces cerevisiae or other yeast, and higher eukaryotes, such as Drosophila, CHO cells, and other mammalian cells, and the like. As discussed herein, chemical synthesis methods can also be utilized to generate the modified proteins of the present invention.

In one aspect, the invention provides construction of recombinant proteins not readily expressed in mammalian cells, such as, for example, fusion proteins and the like. For example, a recombinant gene encoding a fusion protein having bone targeting properties is constructed, wherein a single sequence encodes both a BMP and an antibody binding site having specificity for a bone matrix protein such as osteocalcin or fibronectin. Similarly, a fusion protein can also be constructed to bind to cell surface receptors such as those on osteoprogenitor cells or chondrocytes. Other recombinant genes may encode fusion proteins that specifically bind metals or other proteins. The specificity of the binding would depend on the composition of the leader sequence that is added to the BMP. These genes can be expressed in E. coli and refolded in vitro.

In another embodiment, a cleavable fusion construct (cleavable by proteases such as trypsin, V8, factor Xa and others, or chemically with mild acid, hydroxylamine and other agents) is synthesized wherein the TGF-β protein is attached to a leader sequence that blocks activity. In still another embodiment the activity of a TGF-β family member is restored or enhanced by cleaving a portion or all of the leader sequence. By adding a cleavable leader sequence that inhibits activity, a latent form of the protein is created that can subsequently be cleaved to release a protein fragment comprising the active C-terminal domain.

In yet another embodiment, the leader sequence is also a tissue-targeting sequence, such that release can be controlled to occur at the target site in vivo. The construction of the cleavage site can also allow one to control the release of active protein. For example, in bone tissue a number of proteases involved in bone remodeling typically are present and can be used to advantage. A cleavable “hexa-his”, FB leader, or collagen binding sequence described below may be a suitable leader sequence for a latent form of the protein. By way of example, the tissue-targeting domain can be separated from a BMP by a leader sequence that includes a run of at least three basic residues, which is known to be cleaved in vivo.

In still another embodiment, the leader sequence can be constructed so that the portion of the protein that is inhibiting specific activity is cleaved and activity restored, but the tissue-targeting portion of the protein is retained.

In yet another preferred embodiment, the leader sequence of the TGF-β family protein is replaced by a leader sequence of another TGF-β member. The resultant “chimeric” protein may have altered solubility, folding and/or tissue targeting activity, improved stability, and/or the ability to bind to specific surfaces.

In another aspect of the invention, the fusion proteins are combined with other TGF-β family proteins to form heterodimers, wherein one can exploit the properties of each protein. For example, a fusion protein with tissue-targeting properties but no activity forms a heterodimer with a different protein which has activity, but no tissue-targeting ability. The former protein delivers the heterodimer to a target site where the latter protein can perform its function.

In one aspect the invention provides biosynthetic BMPs and TGF-β family member proteins having improved refolding properties under neutral or physiological conditions. In one embodiment, the biosynthetic proteins of the invention have improved refolding properties at a pH in the range of about 5.0-10.0, preferably in the range of about 6.0-9.0, more preferably in the range of about 6.0-8.5, including in the range of about pH 7.0-7.5.

In another aspect the invention provides biosynthetic BMPs and TGF-β family member proteins having improved solubility properties under neutral or physiological conditions. In one embodiment, the biosynthetic proteins of the invention have improved solubility at a pH in the range of about 5.0-10.0, preferably in the range of about 6.0-9.0, more preferably in the range of about 6.0-8.5, including in the range of about pH 7.0-7.5.

In still another aspect the invention provides biologically active biosynthetic BMPs and TGF-β family member constructs competent to refold under physiological conditions and having altered isoelectric points as compared with the parent sequence.

In another aspect, the invention provides a method for folding homodimers and heterodimers, which are poor refolders, under physiological or neutral pH conditions. In one embodiment, the method comprises the steps of providing one or more solubilized TGF-β family protein constructs of the invention, exposing the solubilized protein to a redox reaction in a suitable refolding buffer, and allowing the protein subunits to refold into homodimers and/or heterodimers, as desired. In another embodiment, the modified TGF-β family proteins of the invention are not denatured prior to exposing them to the redox reaction. In another embodiment, the redox reaction system can utilize oxidized and reduced forms of glutathione, DTT, β-mercaptoethanol, cysteine and cystamine. In another embodiment, the redox reaction system relies on air oxidation, preferably in the presence of a metal catalyst, such as copper. In still another embodiment, these can be used as redox systems at ratios of reductant to oxidant of about 1:10 to about 10:1, preferably in the range of about 1:2 to 2:1. In another preferred embodiment, the protein is solubilized in the presence of a detergent, including an ionic detergent, a non-ionic detergent, e.g. digitonin, or zwitterionic detergents, such as 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS), or N-octyl glucoside. In still another embodiment, the refolding reaction occurs in a pH range of about 5.0-10.0, preferably in the range of about 6.0-9.0, more preferably in the range of about 7.0-8.5. In still another embodiment, the refolding reaction occurs at a temperature within the range of about 32-0° C., preferably in the range of about 25-4° C. Where heterodimers are being created, optimal ratios for adding the two different subunits readily can be determined empirically and without undue experimentation.

In another aspect, the invention provides methods for recombinantly producing poor refolder BMP and other TGF-β family member proteins in a host cell, including a bacterial host, or any other host cell where overexpressed protein aggregates in a form that requires solubilization and/or refolding in vitro. The method comprises the steps of providing a host cell transfected with nucleic acid molecules encoding one or more of the biosynthetic proteins of the invention, cultivating the host cells under conditions suitable for expressing the biosynthetic protein, collecting the aggregated protein, and solubilizing and refolding the protein using the steps outlined above. In another embodiment, the method comprises the additional step of transfecting the host cell with a nucleic acid encoding the biosynthetic protein of the invention.

Modified morphogens of the invention may be used to form bone and/or cartilage in conjunction with a biocompatible matrix such as (but not limited to) collagen, hydroxyapatite, ceramics, carboxymethylcellulose, and/or other suitable carrier or matrix material. Such combinations are particularly useful in methods for regenerating bone, cartilage and/or other non-mineralized skeletal or connective tissues such as (but not limited to) articular cartilage, fibrocartilage, ligament, tendon, joint capsule, menisci, intervertebral disks, synovial membrane tissue, muscle, and fascia, to name but a few. See e.g. U.S. Pat. Nos. 5,674,292, 5,840,325 and U.S. application Ser. No. 08/235,398, the disclosures of which are incorporated by reference herein. The present invention contemplates that the binding and/or adherence properties to such matrix materials can be altered using the techniques disclosed herein for generating protein constructs. The modified proteins of the invention may also be utilized to generate tendon, ligament and/or muscle tissue.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a simplified line drawing useful in describing the structure of a monomeric subunit of a TGF-β superfamily member. See the Background of the Invention, supra, for explanation. FIGS. 1B, 1C, and 1D are monovision ribbon tracings of the respective peptide backbones of typical secondary structures of the finger 1, heel, and finger 2 regions.

FIGS. 2A and 2B are stereo peptide backbone ribbon trace drawings illustrating the generic three-dimensional shape of a TGF-β superfamily member protein dimer: A) from the “top” (down the two-fold axis of symmetry between the subunits) with the axes of the helical heel regions generally normal to the paper and the axes of each of the finger 1 and finger 2 regions generally vertical, and B) from the “side” with the two-fold axis between the subunits in the plane of the paper, with the axes of the heels generally horizontal, and the axes of the fingers generally vertical. The reader is encouraged to view the stereo alpha carbon trace drawings in wall-eyed stereo to understand better the spatial relationships in the morphon design.

FIG. 3 is a pattern definition table prepared in accordance with the teaching of Smith and Smith (1990) Proc. Natl. Acad. Sci. USA 87:118-122.

FIG. 4 lists the aligned C-terminal residues defining the finger 2 sub-domain for various known members of the BMP family, and TGF-β superfamily of proteins, starting with the first residue following the cysteine doublet.

FIGS. 5A, 5B, and 5C are single letter code listings of amino acid sequences, arranged to indicate alignments and homologies of the finger 1, heel, and finger 2 regions, respectively, of the currently known members of the TGF-β superfamily. Shown are the respective amino acids comprising each region of human TGF-β1 through TGF-β5 (the TGF-β subgroup), the Vg/dpp subgroup consisting of dpp, Vg-1, Vgr-1, 60A (see copending U.S. Ser. No. 08/271,556), BMP-2A (also known in the literature as BMP-2), dorsalin, BMP-2B (also known in the literature as BMP-4), BMP-3, BMP-5, BMP-6, OP-1 (also known in the literature as BMP-7), OP-2 (see PCT/US91/07635 and U.S. Pat. No. 5,266,683) and OP-3 (U.S. Ser. No. 07/971,091), the GDF subgroup consisting of GDF-1, GDF-3, and GDF-9, and the Inhibin subgroup consisting of Inhibin α, Inhibin βA, and Inhibin βB. The dashes (-) indicate a peptide bond between adjacent amino acids. A consensus sequence pattern for each subgroup is shown at the bottom of each subgroup.

FIG. 6 is a single letter code listing of amino acid sequences, identified in capital letters in standard single letter amino acid code, and in lower case letters to identify groups of amino acids useful in that location, wherein the lower case letters stand for the amino acids indicated in accordance with the pattern definition key table set forth in FIG. 3. FIG. 6 identifies preferred pattern sequences for constituting the finger 1, heel, and finger 2 regions of biosynthetic constructs of the invention. The dashes (-) indicate a peptide bond between adjacent amino acids.

FIG. 7(A) shows the nucleotide and corresponding amino acid sequences of H2487, a modified OP-1 comprising an N-terminal decapeptide collagen binding site inserted upstream of the seven-cysteine domain.

FIG. 7(B) shows the nucleotide and corresponding amino acid sequences of H2440, a modified OP-1 comprising a hexa-histidine domain attached 35 residues upstream of the first cysteine in the seven-cysteine domain.

FIG. 7(C) shows the nucleotide and amino acid sequences of H2521, a modified OP-1 comprising an FB leader domain of protein A attached 15 residues upstream of the first cysteine in the seven-cysteine domain.

FIG. 7(D) shows the nucleotide and amino acid sequences of H2525, a modified OP-1 comprising both an FB leader domain of protein A and a hexa-histidine domain.

FIG. 7(E) shows the nucleotide and amino acid sequences of H2527, a modified OP-1 comprising an FB leader domain, a hexa-histidine domain, and an ASP-PRO acid cleavage site.

FIG. 7(F) shows the nucleotide and amino acid sequences of H2528, a modified CDMP-3 comprising an FB leader domain and a hexa-histidine domain.

FIG. 7(G) shows the nucleotide and amino acid sequences of H2469, a modified OP-1 (truncated) comprising 14 original residues upstream of the first cysteine in the conserved seven-cysteine domain.

FIG. 7(H) shows the nucleotide and amino acid sequences of H2510, a modified OP-1 comprising a collagen binding site inserted 7 residues upstream of the first cysteine in the conserved seven-cysteine domain.

FIG. 7(I) shows the nucleotide and amino acid sequences of H2523, a modified OP-1 comprising a collagen peptide and a spacer added 13 residues upstream from the first cysteine in the conserved seven-cysteine domain.

FIG. 7(J) shows the nucleotide and amino acid sequences of H2524, a modified OP-1 comprising a hexa-histidine domain, a collagen peptide and a spacer added 13 residues upstream from the first cysteine in the conserved seven-cysteine domain.

FIG. 8 is a restriction map encoding the OP-1 C-terminal seven cysteine active domain;

FIG. 9(A) is a schematic representation of various biosynthetic chimeric BMP constructs;

FIG. 9(B) is a schematic representation of biosynthetic BMP mutants and their refolding and ROS activity;

FIG. 10 shows the number of charged residues in the C-terminal sub-domains for various BMPs.

FIG. 11 is a graph of ROS activity for OP-1 (standard), the mutant H2549 protein and H2549 treated with trypsin, plotted as concentration (ng/mL) vs. optical density (at 405 nm).

FIG. 12 is a graph of ROS activity for OP-1 (standard) and various fractions of the mutant H2223 protein and the trypsin truncated form of this protein, plotted as concentration (ng/mL) vs. optical density (at 405 nm).

FIG. 13(A) is a graph of ROS activity for OP-1 homodimer (from CHO cells), BMP-2 homodimer and hexa-his OP-1 heterodimer, plotted as concentration (ng/mL) vs. optical density (405 nm).

FIG. 13(B) is a graph of ROS activity for OP-1 homodimer (from CHO cells), hexa-his OP-1/BMP-2 heterodimer and hexa-his OP-1, plotted as concentration (ng/mL) vs. optical density (405 nm).

FIG. 14 is a graph of ROS activity for OP-1 (standard), BMP-2 mutant H2142 protein homodimer, mutant H2525 protein homodimer and H2525/2142 heterodimer, plotted as concentration (ng/mL) vs. optical density (405 nm).

FIG. 15 shows the amino acid sequences for the finger 2 subdomain of various OP-1 mutants and their folding efficiencies and biological activities in the ROS cell based alkaline phosphatase assay.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The present invention provides modified forms of TGF-β family proteins which have altered refolding properties, and altered activity profiles compared to natural forms. Modified proteins of the invention comprise N-terminal modifications of naturally-occurring TGF-β family members, especially morphogenic proteins. These modifications include extension, truncation, and/or activation by protease or chemical cleavage at specific sites (e.g., by acid or CNBr), attachment (fusion) of distinct protein domains and production of heterodimers with subunits from other TGF-β family members. The detailed description provided below describes an exemplary array of substitutions, fusions, and extensions that result in improved activity and pharmaceutical properties. Methods of producing modified proteins are also taught.

According to one aspect of the invention, the folding capabilities of poor refolder BMPs and other members of the TGF-β superfamily of proteins, including heterodimers and chimeras thereof, are improved by fusing specific targeting and receptor-binding regions to the existing N-terminal domain of BMP or TGF-β family members, which can then be cleaved at sites within the fusion protein. As a result of this discovery, it is possible to design BMP and other TGF-β family proteins that (1) are expressed recombinantly in prokaryotic or eukaryotic cells or synthesized using polypeptide synthesizers; (2) have altered folding capabilities; (3) have altered solubility under neutral pHs, including but not limited to physiological conditions; (4) have altered isoelectric points; (5) have altered stability; (6) have altered binding or adherence properties to solid surfaces (e.g., biocompatible matrices or metals); and/or (7) have a desired, altered biological activity, such as tissue and/or receptor specificity. In addition, the invention provides means for testing new candidate constructs rapidly, particularly a biological or biochemical property of the candidate. The invention also provides means for rapidly mapping epitopes of antibodies, for example by making chimeric proteins with different combinations of domains. Specifically, making use of the discoveries disclosed herein, morphogen sequences which otherwise could not be expressed in a prokaryotic host such as E. coli now can be modified to allow expression in E. coli and refolding in vitro.

Thus, the present invention can provide mechanisms for designing quick-release, slow-release and/or timed-release formulations containing a preferred chimeric protein. In addition, the present invention provides mechanisms for designing formulations engineered for environmentally-triggered release of a protein construct. That is, modified proteins can be designed to modulate delivery and facilitate release and activity under particular environmental conditions in situ, such as changes in pH, presence of a specific protease, etc. Other advantages and features will be evident from the teachings below. Moreover, making use of the discoveries disclosed herein, modified proteins having altered surface-binding/surface-adherent properties can be designed and selected. Surfaces of particular significance include, but are not limited to, solid surfaces which can be naturally-occurring such as bone; or porous particulate surfaces such as collagen or other biocompatible matrices; or the fabricated surfaces of prosthetic implants, including metals. As contemplated herein, virtually any surface can be assayed for differential binding of constructs. Thus, the present invention embraces a diversity of functional molecules having alterations in their surface-binding/surface-adherent properties, thereby rendering such constructs useful for altered in vivo applications, including slow-release, fast-release and/or timed-release formulations.

The skilled artisan will appreciate that mixing-and-matching any one or more of the above-recited attributes provides specific opportunities to manipulate the uses of customized modified proteins (and DNAs encoding the same). For example, the attribute of altered stability can be exploited to manipulate the turnover of a protein in vivo. Moreover, in the case of modified proteins also having attributes such as altered re-folding and/or function, there is likely an interconnection between folding, function and stability. See, for example, Lipscomb et al., 7 Protein Sci. 765-73 (1998); and Nikolova et al., 95 Proc. Natl. Acad. Sci. USA 14675-80 (1998). For purposes of the present invention, stability alterations can be routinely monitored using well-known techniques of circular dichroism and other indices of stability as a function of denaturant concentration or temperature. One can also use routine scanning calorimetry. Similarly, there is likely an interconnection between any of the foregoing attributes and the attribute of solubility. In the case of solubility, it is possible to manipulate this attribute so that a modified protein is either more or less soluble under physiologically-compatible conditions and it consequently diffuses readily or remains localized, respectively, when administered in vivo.

Provided below are detailed descriptions of suitable biosynthetic proteins and methods useful in the practice of the invention, as well as methods for using and testing these proteins; and numerous, nonlimiting examples which 1) illustrate the suitability of the biosynthetic proteins and methods described herein; and 2) provide assays with which to test and use these proteins.

I. Protein Considerations

A. Structural Features of TGF-β2 and OP-1.

Each of the subunits in either TGF-β2 or OP-1 has a characteristic folding pattern, illustrated schematically in FIG. 1A, that involves six of the seven C-terminal cysteine residues. Briefly, four of the cysteine residues in each subunit form two disulfide bonds which together create an eight residue ring, while two additional cysteine residues form a disulfide bond that passes through the ring to form a knot-like structure. With a numbering scheme beginning with the most N-terminal cysteine of the 7 conserved cysteine residues assigned number 1, the 2nd and 6th cysteine residues are disulfide bonded to close one side of the eight residue ring while the 3rd and 7th cysteine residues are disulfide bonded to close the other side of the ring. The 1st and 5th conserved cysteine residues are disulfide bonded through the center of the ring to form the core of the knot. Amino acid sequence alignment patterns suggest this structural motif is conserved between members of the TGF-β superfamily. The 4th cysteine is semi-conserved and when present typically forms an interchain disulfide bond (ICDB) with the corresponding cysteine residue in the other subunit.
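The pairing scheme just described can be recorded compactly. The following Python sketch simply encodes the stated bonds (2-6 and 3-7 closing the ring, 1-5 through the ring, and cysteine 4 bonding across subunits when present); the names and the return format are hypothetical and chosen only for illustration.

```python
# Intrachain disulfide bonds, numbered from the most N-terminal of the
# seven conserved cysteines, as described in the text.
INTRACHAIN_BONDS = {
    "ring, first side": (2, 6),
    "ring, second side": (3, 7),
    "knot core": (1, 5),
}

# The semi-conserved 4th cysteine bonds to cysteine 4 of the other subunit.
INTERCHAIN_CYSTEINE = 4


def disulfide_partner(cysteine_number):
    """Return (scope, partner number) for a conserved cysteine 1-7."""
    if not 1 <= cysteine_number <= 7:
        raise ValueError("conserved cysteines are numbered 1 through 7")
    if cysteine_number == INTERCHAIN_CYSTEINE:
        return ("interchain", INTERCHAIN_CYSTEINE)
    for first, second in INTRACHAIN_BONDS.values():
        if cysteine_number == first:
            return ("intrachain", second)
        if cysteine_number == second:
            return ("intrachain", first)
    raise AssertionError("unreachable: every cysteine 1-7 is covered")
```

For example, disulfide_partner(2) gives ("intrachain", 6), and disulfide_partner(4) gives ("interchain", 4).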

The structure of each subunit in TGF-β2 and OP-1 comprises three major tertiary structural elements and an N-terminal region. The structural elements are made up of regions of contiguous polypeptide chain that possess over 50% secondary structure of the following types: (1) loop, (2) α-helix and (3) β-sheet. Another defining criterion for each structural region is that the entering (N-terminal) and exiting (C-terminal) peptide strands are fairly close together, being about 7 Å apart.

The amino acid sequence between the 1st and 2nd conserved cysteines, as shown in FIG. 1A, forms a structural region characterized by an anti-parallel β-sheet finger referred to herein as the finger 1 region. Similarly the residues between the 5th and 6th conserved cysteines, as shown in FIG. 1A, also form an anti-parallel β-sheet finger, referred to herein as the finger 2 region. A β-sheet finger is a single amino acid chain, comprising a β-strand that folds back on itself by means of a β-turn or some larger loop so that the polypeptide chain entering and exiting the region form one or more anti-parallel β-sheet structures. The third major structural region, involving the residues between the 3rd and 5th conserved cysteines, as shown in FIG. 1A, is characterized by a three turn α-helix, referred to herein as the heel region. The organization of the monomer structure is similar to that of a left hand where the knot region is located at the position equivalent to the palm, the finger 1 region is equivalent to the index and middle fingers, the α-helix, or heel region, is equivalent to the heel of the hand, and the finger 2 region is equivalent to the ring and small fingers. The N-terminal region, whose sequence is not conserved across the TGF-β superfamily, is predicted to be located at a position roughly equivalent to the thumb.

Monovision ribbon tracings of the alpha carbon backbones of each of the three major independent structural elements of the TGF-β2 monomer are illustrated in FIGS. 1B-1D. Specifically, an exemplary finger 1 region comprising the first anti-parallel β-sheet segment is shown in FIG. 1B, an exemplary heel region comprising the three turn α-helical segment is shown in FIG. 1C, and an exemplary finger 2 region comprising second and third anti-parallel β-sheet segments is shown in FIG. 1D.

FIG. 2 shows stereo ribbon trace drawings of the peptide backbone of the conformationally active TGF-β2 dimer complex. The two monomer subunits in the dimer complex are oriented with two-fold rotational symmetry such that the heel region of one subunit contacts the finger regions of the other subunit with the knot regions of the connected subunits forming the core of the molecule. The 4th cysteine forms an interchain disulfide bond with its counterpart on the second chain thereby equivalently linking the chains at the center of the palms. The dimer thus formed is an ellipsoidal (cigar shaped) molecule when viewed from the top looking down the two-fold axis of symmetry between the subunits (FIG. 2A). Viewed from the side, the molecule resembles a bent “cigar” since the two subunits are oriented at a slight angle relative to each other (FIG. 2B).

As shown in FIG. 2, each of the structural elements which together define the native monomer subunits of the dimer are labeled 22, 22′, 23, 23′, 24, 24′, 25, 25′, 26, and 26′, wherein elements 22, 23, 24, 25, and 26 are defined by one subunit and elements 22′, 23′, 24′, 25′, and 26′ belong to the other subunit. Specifically, 22 and 22′ denote N-terminal domains; 23 and 23′ denote the finger 1 regions; 24 and 24′ denote heel regions; 25 and 25′ denote the finger 2 regions; and 26 and 26′ denote disulfide bonds which connect the 1st and 5th conserved cysteines of each subunit to form the knot-like structure. From FIG. 2, it can be seen that the heel region from one subunit, e.g., 24, and the finger 1 and finger 2 regions, e.g., 23′ and 25′, respectively from the other subunit, interact with one another. These three elements co-operate with one another to define a structure interactive with, and complementary to, the ligand binding interactive surface of the cognate receptor.

(1) Selection of Finger and Heel Regions

It is contemplated that the amino acid sequences defining the finger and heel regions may be utilized from the respective finger and heel region sequences of any known member of the TGF-β superfamily, identified herein, or from amino acid sequences of a new superfamily member discovered hereafter.

FIG. 5 summarizes the amino acid sequences of currently identified TGF-β superfamily members aligned into finger 1 (FIG. 5A), heel (FIG. 5B) and finger 2 (FIG. 5C) regions. The sequences were aligned by a computer algorithm which, in order to optimally align the sequences, inserted gaps into regions of amino acid sequence known to define loop structures rather than regions of amino acid sequence known to have conserved amino acid sequence or secondary structure. For example, if possible, no gaps were introduced into amino acid sequences of finger 1 and finger 2 regions defined by β-sheet or heel regions defined by α-helix. The dashes (-) indicate a peptide bond between adjacent amino acids. A consensus sequence pattern for each subgroup is shown at the bottom of each subgroup.

After the amino acid sequences of each of the TGF-β superfamily members were aligned, the aligned sequences were used to produce amino acid sequence alignment patterns which identify amino acid residues that may be substituted by another amino acid or group of amino acids without altering the overall tertiary structure of the resulting construct. The amino acids or groups of amino acids that may be useful at a particular position in the finger and heel regions were identified by a computer algorithm implementing the amino acid hierarchy pattern structure shown in FIG. 3.

Briefly, the algorithm performs four levels of analysis. In Level I, the algorithm determines whether a particular amino acid residue occurs with a frequency greater than 75% at a specific position within the amino acid sequence. For example, if a glycine residue occurs 8 out of 10 times at a particular position in an amino acid sequence, then a glycine is designated at that position. If the position to be tested consists of all gaps, then a gap character (−) is assigned to the position; otherwise, if at least one gap exists, then a “z” (standing for any residue or a gap) is assigned to the position. If no amino acid occurs in 75% of the candidate sequences at a particular position, the algorithm implements the Level II analysis.
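By way of illustration only, the Level I analysis described above may be sketched as follows. The function name `level_one`, the use of one-letter residue codes, and the use of “-” as the gap symbol are assumptions made for this sketch. The text does not state whether the gap tests precede the frequency test; the sketch assumes that they do.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol for this sketch


def level_one(column, threshold=0.75):
    """Return a Level I assignment for one alignment column, or None.

    column: string of one-letter residue codes, one per aligned sequence,
    with '-' marking a gap. None means "defer to the Level II analysis".
    """
    if all(ch == GAP for ch in column):
        return GAP                      # every sequence has a gap here
    if GAP in column:
        return "z"                      # any residue or a gap
    residue, count = Counter(column).most_common(1)[0]
    if count / len(column) > threshold:
        return residue                  # e.g. glycine in 8 of 10 sequences
    return None                         # no residue is frequent enough


print(level_one("GGGGGGGGAS"))  # G
```

The strict comparison follows the phrase “a frequency greater than 75%”; a column in which the most common residue occurs in exactly 75% of the sequences is therefore passed on to Level II in this sketch.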

Level II defines pattern sets a, b, d, l, k, o, n, i, and h, wherein l, k, and o share a common amino acid residue. The algorithm then determines whether 75% or more of the amino acid residues at a particular position in the amino acid sequence satisfy one of the aforementioned patterns. If so, then the pattern is assigned to that position. It is possible, however, that both patterns l and k may be simultaneously satisfied because they share the same amino acid, specifically aspartic acid. If simultaneous assignment of l and k occurs, then pattern m (Level III) is assigned to that position. Likewise, it is possible that both patterns k and o may be simultaneously assigned because they share the same amino acid, specifically glutamic acid. If simultaneous assignment of k and o occurs, then pattern q (Level III) is assigned to that position. If neither a Level II pattern nor the Level III patterns, m and q, satisfy a particular position in the amino acid sequence, then the algorithm implements a Level III analysis.

Level III defines pattern sets c, e, m, q, p, and j, wherein m, q, and p share a common amino acid residue. Pattern q, however, is not tested in the Level III analysis. It is possible that both patterns m and p may be simultaneously satisfied because they share the same amino acid, specifically, glutamic acid. If simultaneous assignment of m and p occurs, then pattern r (Level IV) is assigned to that position. If 75% of the amino acids at a pre-selected position in the aligned amino acid sequences satisfy a Level III pattern, then the Level III pattern is assigned to that position. If a Level III pattern cannot be assigned to that position, then the algorithm implements a Level IV analysis.

Level IV comprises two non-overlapping patterns f and r. If 75% of the amino acids at a particular position in the amino acid sequence satisfy a Level IV pattern, then the pattern is assigned to the position. If no Level IV pattern is assigned, the algorithm assigns an X representing any amino acid (Level V) to that position.
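The fall-through from Level II to Level V may be sketched, under stated assumptions, as shown below. The membership of each pattern set is defined in FIG. 3 and is not reproduced here; apart from pattern “n” (lysine or glutamine, as stated elsewhere in this description), the residue sets assigned to the letters in `LEVELS` are hypothetical placeholders chosen only to make the sketch runnable. The promotion rules for simultaneous assignments (l and k to m, k and o to q, m and p to r) are deliberately omitted.

```python
def assign_pattern(column, levels, threshold=0.75):
    """Return the first pattern letter whose residue set covers the column.

    column: string of one-letter residue codes at one alignment position.
    levels: list of dicts (Level II, III, IV in order) mapping a pattern
            letter to the set of residues that satisfy it.
    """
    for level in levels:
        for letter, residues in level.items():
            hits = sum(1 for ch in column if ch in residues)
            if hits / len(column) >= threshold:   # "75% or more"
                return letter
    return "X"                                    # Level V: any amino acid


# Illustrative sets only. "n" follows the text; "c" and "f" are placeholders
# and are NOT the definitions given in FIG. 3.
LEVELS = [
    {"n": set("KQ")},          # Level II
    {"c": set("ILMV")},        # Level III (hypothetical membership)
    {"f": set("CILMFWYV")},    # Level IV (hypothetical membership)
]

print(assign_pattern("KKQQKQKQ", LEVELS))  # n
```

A column is tested against the narrowest sets first, so a broader pattern is assigned only when no narrower one reaches the threshold.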

In FIG. 3, Level I lists the 20 naturally occurring amino acids in upper case letters in single-letter amino acid code. Levels II-V define, in lower case letters, groups of amino acids based upon the amino acid hierarchy set forth in Smith et al., supra. The amino acid sequences set forth in FIGS. 5 and 6 were aligned using the aforementioned computer algorithms.

It is contemplated that if the artisan wishes to produce a morphon construct based upon currently identified members of the TGF-β superfamily, then the artisan may use the amino acid sequences shown in FIG. 5 to provide the finger 1, finger 2 and heel regions useful in the production of the morphon constructs of the invention. In the case of members of the TGF-β superfamily discovered hereafter, the amino acid sequence of the new member may be aligned, either manually or by means of a computer algorithm, with the sequences set forth in FIG. 5 to define heel and finger regions useful in the practice of the invention.

Table 1 below summarizes publications which describe the amino acid sequences of each TGF-β superfamily member that were used to produce the sequence alignment patterns set forth in FIGS. 5 and 6.

TABLE 1

TGF-β Superfamily
Member            SEQ. ID. No.  Publication
TGF-β1            40            Derynck et al. (1987) Nucl. Acids. Res. 15: 3187
TGF-β2            41            Burt et al. (1991) DNA Cell Biol. 10: 723-734
TGF-β3            42            Ten Dijke et al. (1988) Proc. Natl. Acad. Sci. USA 85: 4715-4719; Derynck et al. (1988) EMBO J. 7: 3737-3743
TGF-β4            43            Burt et al. (1992) Mol. Endocrinol. 6: 989-922
TGF-β5            44            Kondaiah et al. (1990) J. Biol. Chem. 265: 1089-1093
dpp               45            Padgett et al. (1987) Nature 325: 81-84; Paganiban et al. (1990) Mol. Cell Biol. 10: 2669-2677
vg-1              46            Weeks et al. (1987) Cell 51: 861-867
vgr-1             47            Lyons et al. (1989) Proc. Natl. Acad. Sci. USA 86: 4554-4558
60A               48            Wharton et al. (1991) Proc. Natl. Acad. Sci. USA 88: 9214-9218; Doctor et al. (1992) Dev. Biol. 151: 491-505
BMP-2A            49            Wozney et al. (1988) Science 242: 1528-1534
BMP-3             50            Wozney et al. (1988) Science 242: 1528-1534
BMP-4             51            Wozney et al. (1988) Science 242: 1528-1534
BMP-5             52            Celeste et al. (1990) Proc. Natl. Acad. Sci. USA 87: 9843-9847
BMP-6             53            Celeste et al. (1990) Proc. Natl. Acad. Sci. USA 87: 9843-9847
Dorsalin          54            Basler et al. (1993) Cell 73: 687-702
OP-1              55            Celeste et al. (1990) Proc. Natl. Acad. Sci. USA 87: 9843-9847; Ozkaynak et al. (1990) EMBO J. 9: 2085-2093
OP-2              56            Ozkaynak et al. (1992) J. Biol. Chem. 267: 25220-25227
OP-3              57            Ozkaynak et al. PCT/WO94/10203 Seq. I.D. No. 1
GDF-1             58            Lee (1990) Mol. Endocrinol. 4: 1034-1040
GDF-3             59            McPherron et al. (1993) J. Biol. Chem. 268: 3444-3449
GDF-9             60            McPherron et al. (1993) J. Biol. Chem. 268: 3444-3449
Inhibin α         61            Mayo et al. (1986) Proc. Natl. Acad. Sci. USA 83: 5849-5853; Stewart et al. (1986) FEBS Lett 206: 329-334; Mason et al. (1986) Biochem. Biophys. Res. Commun. 135: 957-964
Inhibin βA        62            Forage et al. (1986) Proc. Natl. Acad. Sci. USA 83: 3091-3095; Chertov et al. (1990) Biomed. Sci. 1: 499-506
Inhibin βB        63            Mason et al. (1986) Biochem. Biophys. Res. Commun. 135: 957-964

The invention further contemplates the use of corresponding finger 1 subdomain sequences from the well-known proteins: GDF-5, GDF-7 (as disclosed in U.S. Pat. No. 5,801,014, the entire disclosure of which is incorporated herein by reference); GDF-6 (as disclosed in U.S. Pat. No. 5,770,444, the entire disclosure of which is incorporated herein by reference); and BMP-12 and BMP-13 (as disclosed in U.S. Pat. No. 5,658,882, the entire disclosure of which is incorporated herein by reference).

In particular, it is contemplated that amino acid sequences defining finger 1 regions useful in the practice of the instant invention correspond to the amino acid sequence defining a finger 1 region for any TGF-β superfamily member identified herein. The finger 1 subdomain can confer at least biological and/or functional attribute(s) which are characteristic of the native protein. Useful intact finger 1 regions include, but are not limited to, TGF-β1 SEQ. ID. No. 40, residues 2 through 29, TGF-β2 SEQ. ID. No. 41, residues 2 through 29, TGF-β3 SEQ. ID. No. 42, residues 2 through 29, TGF-β4 SEQ. ID. No. 43, residues 2 through 29, TGF-β5 SEQ. ID. No. 44, residues 2 through 29, dpp SEQ. ID. No. 45, residues 2 through 29, Vg-1 SEQ. ID. No. 46, residues 2 through 29, Vgr-1 SEQ. ID. No. 47, residues 2 through 29, 60A SEQ. ID. No. 48, residues 2 through 29, BMP-2A SEQ. ID. No. 49, residues 2 through 29, BMP-3 SEQ. ID. No. 50, residues 2 through 29, BMP-4 SEQ. ID. No. 51, residues 2 through 29, BMP-5 SEQ. ID. No. 52, residues 2 through 29, BMP-6 SEQ. ID. No. 53, residues 2 through 29, Dorsalin SEQ. ID. No. 54, residues 2 through 29, OP-1 SEQ. ID. No. 55, residues 2 through 29, OP-2 SEQ. ID. No. 56, residues 2 through 29, OP-3 SEQ. ID. No. 57, residues 2 through 29, GDF-1 SEQ. ID. No. 58, residues 2 through 29, GDF-3 SEQ. ID. No. 59, residues 2 through 29, GDF-9 SEQ. ID. No. 60, residues 2 through 29, Inhibin α SEQ. ID. No. 61, residues 2 through 29, Inhibin βA SEQ. ID. No. 62, residues 2 through 29, Inhibin βB SEQ. ID. No. 63, residues 2 through 29, CDMP-1/GDF-5 SEQ. ID. No. 83, residues 2 through 29, CDMP-2/GDF-6 SEQ. ID. No. 84, residues 2 through 29, GDF-6 (murine) SEQ. ID. No. 85, residues 2 through 29, CDMP-2 (bovine) SEQ. ID. No. 86, residues 2 through 29, and GDF-7 (murine) SEQ. ID. No. 87, residues 2 through 29.

The invention further contemplates the use of corresponding heel subdomain sequences from the well-known proteins BMP-12 and BMP-13 (as disclosed in U.S. Pat. No. 5,658,882, the entire disclosure of which is incorporated herein by reference).

It is contemplated also that amino acid sequences defining heel regions useful in the practice of the instant invention correspond to the amino acid sequence defining an intact heel region for any TGF-β superfamily member identified herein. The heel region can at least influence attributes of the native protein, including functional and/or folding attributes. Useful intact heel regions may include, but are not limited to, TGF-β1 SEQ. ID. No. 40, residues 35 through 62, TGF-β2 SEQ. ID. No. 41, residues 35 through 62, TGF-β3 SEQ. ID. No. 42, residues 35 through 62, TGF-β4 SEQ. ID. No. 43, residues 35 through 62, TGF-β5 SEQ. ID. No. 44, residues 35 through 62, dpp SEQ. ID. No. 45, residues 35 through 65, Vg-1 SEQ. ID. No. 46, residues 35 through 65, Vgr-1 SEQ. ID. No. 47, residues 35 through 65, 60A SEQ. ID. No. 48, residues 35 through 65, BMP-2A SEQ. ID. No. 49, residues 35 through 64, BMP-3 SEQ. ID. No. 50, residues 35 through 66, BMP-4 SEQ. ID. No. 51, residues 35 through 64, BMP-5 SEQ. ID. No. 52, residues 35 through 65, BMP-6 SEQ. ID. No. 53, residues 35 through 65, Dorsalin SEQ. ID. No. 54, residues 35 through 65, OP-1 SEQ. ID. No. 55, residues 35 through 65, OP-2 SEQ. ID. No. 56, residues 35 through 65, OP-3 SEQ. ID. No. 57, residues 35 through 65, GDF-1 SEQ. ID. No. 58, residues 35 through 70, GDF-3 SEQ. ID. No. 59, residues 35 through 64, GDF-9 SEQ. ID. No. 60, residues 35 through 65, Inhibin α SEQ. ID. No. 61, residues 35 through 65, Inhibin βA SEQ. ID. No. 62, residues 35 through 69, Inhibin βB SEQ. ID. No. 63, residues 35 through 68, CDMP-1/GDF-5 SEQ. ID. No. 83, residues 35 through 65, CDMP-2/GDF-6 SEQ. ID. No. 84, residues 35 through 65, GDF-6 (murine) SEQ. ID. No. 85, residues 35 through 65, CDMP-2 (bovine) SEQ. ID. No. 86, residues 35 through 65, and GDF-7 (murine) SEQ. ID. No. 87, residues 35 through 65.

The invention further contemplates the use of corresponding finger 2 subdomain sequences from the well-known proteins BMP-12 and BMP-13 (as disclosed in U.S. Pat. No. 5,658,882, the entire disclosure of which is incorporated herein by reference).

It is contemplated also that amino acid sequences defining finger 2 regions useful in the practice of the instant invention correspond to the amino acid sequence defining an intact finger 2 region for any TGF-β superfamily member identified herein. The finger 2 subdomain can confer at least folding attribute(s) which are characteristic of the native protein. Useful intact finger 2 regions may include, but are not limited to, TGF-β1 SEQ. ID. No. 40, residues 65 through 94, TGF-β2 SEQ. ID. No. 41, residues 65 through 94, TGF-β3 SEQ. ID. No. 42, residues 65 through 94, TGF-β4 SEQ. ID. No. 43, residues 65 through 94, TGF-β5 SEQ. ID. No. 44, residues 65 through 94, dpp SEQ. ID. No. 45, residues 68 through 98, Vg-1 SEQ. ID. No. 46, residues 68 through 98, Vgr-1 SEQ. ID. No. 47, residues 68 through 98, 60A SEQ. ID. No. 48, residues 68 through 98, BMP-2A SEQ. ID. No. 49, residues 67 through 97, BMP-3 SEQ. ID. No. 50, residues 69 through 99, BMP-4 SEQ. ID. No. 51, residues 67 through 97, BMP-5 SEQ. ID. No. 52, residues 68 through 98, BMP-6 SEQ. ID. No. 53, residues 68 through 98, Dorsalin SEQ. ID. No. 54, residues 68 through 99, OP-1 SEQ. ID. No. 55, residues 68 through 98, OP-2 SEQ. ID. No. 56, residues 68 through 98, OP-3 SEQ. ID. No. 57, residues 68 through 98, GDF-1 SEQ. ID. No. 58, residues 73 through 103, GDF-3 SEQ. ID. No. 59, residues 67 through 97, GDF-9 SEQ. ID. No. 60, residues 68 through 98, Inhibin α SEQ. ID. No. 61, residues 68 through 101, Inhibin βA SEQ. ID. No. 62, residues 72 through 102, Inhibin βB SEQ. ID. No. 63, residues 71 through 101, CDMP-1/GDF-5 SEQ. ID. No. 83, residues 68 through 98, CDMP-2/GDF-6 SEQ. ID. No. 84, residues 68 through 98, GDF-6 (murine) SEQ. ID. No. 85, residues 68 through 98, CDMP-2 (bovine) SEQ. ID. No. 86, residues 68 through 98, and GDF-7 (murine) SEQ. ID. No. 87, residues 68 through 98.
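As an illustration of how the residue ranges recited above might be applied, the following sketch slices the finger 1, heel and finger 2 regions out of a one-letter-code sequence. It assumes that the recited residue numbers are 1-based and inclusive with respect to the corresponding SEQ. ID.; the names `REGIONS` and `extract_region` are hypothetical, and only two family members are tabulated for brevity.

```python
# Residue ranges copied from the text above (1-based, inclusive by assumption).
# "TGF-beta1" stands for TGF-β1 (SEQ. ID. No. 40); "OP-1" is SEQ. ID. No. 55.
REGIONS = {
    "TGF-beta1": {"finger 1": (2, 29), "heel": (35, 62), "finger 2": (65, 94)},
    "OP-1": {"finger 1": (2, 29), "heel": (35, 65), "finger 2": (68, 98)},
}


def extract_region(sequence, member, region):
    """Return the residues of `region` for `member` from `sequence`."""
    start, end = REGIONS[member][region]
    if end > len(sequence):
        raise ValueError("sequence is shorter than the recited residue range")
    # Convert the 1-based inclusive range to a Python slice.
    return sequence[start - 1:end]


# Placeholder sequence used only to exercise the slicing; it is not a real
# TGF-β superfamily sequence.
placeholder = "".join(chr(ord("A") + i % 26) for i in range(98))
print(len(extract_region(placeholder, "OP-1", "heel")))  # 31
```

The actual sequences would be taken from the Sequence Listing; the placeholder string serves only to show that the slice boundaries behave as intended.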

In addition, it is contemplated that the amino acid sequences of the respective finger and heel regions can be altered by amino acid substitution, for example by exploiting substitute residues as disclosed herein or selected in accordance with the principles disclosed in Smith et al. (1990), supra. Briefly, Smith et al. disclose an amino acid class hierarchy similar to the one summarized in FIG. 3, which can be used to rationally substitute one amino acid for another while minimizing gross conformational distortions of the type which could compromise protein function. In any event, it is contemplated that many synthetic first finger, second finger, and heel region sequences, having only 70% homology with natural regions, preferably 80%, and most preferably at least 90%, can be used to produce the constructs of the present invention. Amino acid sequence patterns showing amino acids preferred at each location in the finger and heel regions, deduced in accordance with the principles described in Smith et al. (1990) supra, also are shown in FIGS. 5 and 6, and are referred to as the TGF-β, Vg/dpp, GDF, and Inhibin subgroup patterns. The amino acid sequences defining the finger 1, heel and finger 2 sequence patterns of each subgroup are set forth in FIGS. 5A, 5B, and 5C, respectively. In addition, the amino acid sequences defining the entire TGF-β, Vg/dpp, GDF and Inhibin subgroup patterns are set forth in the Sequence Listing as SEQ. ID. Nos. 64, 65, 66, and 67, respectively.

The preferred amino acid sequence patterns for each subgroup, disclosed in FIGS. 5A, 5B, and 5C, and summarized in FIG. 6, enable one skilled in the art to identify alternative amino acids that may be incorporated at specific positions in the finger 1, heel, and finger 2 elements. The amino acids identified in upper case letters in a single letter amino acid code identify conserved amino acids that together are believed to define structural and functional elements of the finger and heel regions. The upper case letter “X” in FIGS. 5 and 6 indicates that any naturally occurring amino acid is acceptable at that position. The lower case letter “z” in FIGS. 5 and 6 indicates that either a gap or any of the naturally occurring amino acids is acceptable at that position. The lower case letters stand for the amino acids indicated in accordance with the pattern definition table set forth in FIG. 5 and identify groups of amino acids which are useful in that location.

In accordance with the amino acid sequence subgroup patterns set forth in FIGS. 5 and 6, it is contemplated, for example, that the skilled artisan may be able to predict that, where applicable, one amino acid may be substituted by another without inducing disruptive stereochemical changes within the resulting protein construct. For example, in FIG. 5A, in the TGF-β subgroup pattern at residue number 12 it is contemplated that either a lysine residue (K) or a glutamine residue (Q) may be present at this position without affecting the structure of the resulting construct. Accordingly, the sequence pattern at position 12 contains an “n” which in accordance with FIG. 10 defines an amino acid residue selected from the group consisting of lysine or glutamine. It is contemplated, therefore, that many synthetic finger 1, finger 2 and heel region amino acid sequences, having 70% homology, preferably 80%, and most preferably at least 90% with the natural regions, may be used to produce conformationally active proteins of the invention.
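The 70%, 80% and 90% thresholds recited above may be checked, for a pair of aligned sequences, along the lines of the following sketch. The text does not define how homology is to be computed; the sketch assumes, purely for illustration, that it is measured as percent identity over two pre-aligned sequences of equal length. The function name `percent_identity` is hypothetical.

```python
def percent_identity(natural, synthetic):
    """Percent of aligned positions at which the two sequences are identical.

    Both arguments are one-letter-code strings that have already been
    aligned to the same length. Treating "homology" as percent identity
    is an assumption of this sketch, not a definition from the text.
    """
    if len(natural) != len(synthetic) or not natural:
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(natural, synthetic))
    return 100.0 * same / len(natural)


# Positions 1-10 of the TGF-β subgroup pattern versus a single-substitution
# variant (the variant is a made-up example).
print(percent_identity("CCVRPLYIDF", "CCVRPLYIDL"))  # 90.0
```

Other measures, such as similarity scored by residue class, would give different values for the same pair of sequences.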

In accordance with these principles, it is contemplated that one may design a synthetic construct by starting with the amino acid sequence patterns belonging to the TGF-β, Vg/dpp, GDF, or Inhibin subgroup patterns shown in FIGS. 5 and 6. Thereafter, by using conventional recombinant or synthetic methodologies, a preselected amino acid may be substituted by another as guided by the principles herein and the resulting protein construct tested for binding activity in combination with either agonist or antagonist activity.

The TGF-β subgroup pattern, SEQ. ID. No. 64, accommodates the homologies shared among members of the TGF-β subgroup identified to date including TGF-β1, TGF-β2, TGF-β3, TGF-β4, and TGF-β5. The generic sequence, shown below, includes both the conserved amino acids (standard three letter code) as well as alternative amino acids (Xaa) present at the variable positions within the sequence and defined by the rules set forth in FIG. 3.

TGF-β Subgroup Pattern

Cys Cys Val Arg Pro Leu Tyr Ile Asp Phe Arg Xaa Asp Leu Gly Trp
1               5                   10                  15
Lys Trp Ile His Glu Pro Lys Gly Tyr Xaa Ala Asn Phe Cys Xaa Gly
            20                  25                  30
Xaa Cys Pro Tyr Xaa Trp Ser Xaa Asp Thr Gln Xaa Ser Xaa Val Leu
        35                  40                  45
Xaa Leu Tyr Asn Xaa Xaa Asn Pro Xaa Ala Ser Ala Xaa Pro Cys Cys
    50                  55                  60
Val Pro Gln Xaa Leu Glu Pro Leu Xaa Ile Xaa Tyr Tyr Val Gly Arg
65                  70                  75                  80
Xaa Xaa Lys Val Glu Gln Leu Ser Asn Met Xaa Val Xaa Ser Cys Lys
                85                  90                  95
Cys Ser.

Each Xaa can be independently selected from a group of one or more specified amino acids defined as follows, wherein: Xaa12 is Arg or Lys; Xaa26 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa31 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa33 is Ala, Gly, Pro, Ser, or Thr; Xaa37 is Ile, Leu, Met or Val; Xaa40 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa44 is His, Phe, Trp or Tyr; Xaa46 is Arg or Lys; Xaa49 is Ala, Gly, Pro, Ser, or Thr; Xaa53 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa54 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa57 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa61 is Ala, Gly, Pro, Ser, or Thr; Xaa68 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa73 is Ala, Gly, Pro, Ser, or Thr; Xaa75 is Ile, Leu, Met or Val; Xaa81 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa82 is Ala, Gly, Pro, Ser, or Thr; Xaa91 is Ile or Val; Xaa93 is Arg or Lys.
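A candidate sequence may be tested against a subgroup pattern of this kind as sketched below. For brevity the sketch encodes only positions 1 through 16 of the TGF-β subgroup pattern in one-letter code, with “X” standing for Xaa, and only the single Xaa definition that falls in that span (Xaa12 is Arg or Lys). The names `PATTERN`, `ALLOWED` and `matches` are hypothetical, and the candidate sequences are made-up test strings.

```python
# Positions 1-16 of the TGF-β subgroup pattern (SEQ. ID. No. 64):
# Cys Cys Val Arg Pro Leu Tyr Ile Asp Phe Arg Xaa Asp Leu Gly Trp
PATTERN = "CCVRPLYIDFRXDLGW"

# Allowed residues at each Xaa position (1-based), from the definitions above.
ALLOWED = {12: set("RK")}   # Xaa12 is Arg or Lys


def matches(candidate):
    """True if `candidate` satisfies PATTERN at every position."""
    if len(candidate) != len(PATTERN):
        return False
    for pos, (want, got) in enumerate(zip(PATTERN, candidate), start=1):
        if want == "X":
            if got not in ALLOWED[pos]:
                return False            # residue outside the Xaa group
        elif want != got:
            return False                # conserved residue was changed
    return True


print(matches("CCVRPLYIDFRKDLGW"))  # True
```

Extending the sketch to the full 98-residue pattern would only require lengthening `PATTERN` and adding the remaining Xaa groups to `ALLOWED`.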

The Vg/dpp subgroup pattern, SEQ. ID. No. 65, accommodates the homologies shared among members of the Vg/dpp subgroup identified to date including dpp, vg-1, vgr-1, 60A, BMP-2A (BMP-2), Dorsalin, BMP-2B (BMP-4), BMP-3, BMP-5, BMP-6, OP-1 (BMP-7), OP-2 and OP-3. The generic sequence, below, includes both the conserved amino acids (standard three letter code) as well as alternative amino acids (Xaa) present at the variable positions within the sequence and defined by the rules set forth in FIG. 3.

Vg/dpp Subgroup Pattern

Cys Xaa Xaa Xaa Xaa Leu Tyr Val Xaa Phe Xaa Asp Xaa Gly Trp Xaa
1               5                   10                  15
Asp Trp Ile Ile Ala Pro Xaa Gly Tyr Xaa Ala Xaa Tyr Cys Xaa Gly
            20                  25                  30
Xaa Cys Xaa Phe Pro Leu Xaa Xaa Xaa Xaa Asn Xaa Thr Asn His Ala
        35                  40                  45
Ile Xaa Gln Thr Leu Val Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Pro
    50                  55                  60
Lys Xaa Cys Cys Xaa Pro Thr Xaa Leu Xaa Ala Xaa Ser Xaa Leu Tyr
65                  70                  75                  80
Xaa Asp Xaa Xaa Xaa Xaa Xaa Val Xaa Leu Xaa Xaa Tyr Xaa Xaa Met
                85                  90                  95
Xaa Val Xaa Xaa Cys Gly Cys Xaa.
            100

Each Xaa can be independently selected from a group of one or more specified amino acids defined as follows, wherein: Xaa2 is Arg or Lys; Xaa3 is Arg or Lys; Xaa4 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa5 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa9 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa11 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa13 is Ile, Leu, Met or Val; Xaa16 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa23 is Arg, Gln, Glu, or Lys; Xaa26 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa28 is Phe, Trp or Tyr; Xaa31 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa33 is Asp or Glu; Xaa35 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa39 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa40 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa41 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa42 is Leu or Met; Xaa44 is Ala, Gly, Pro, Ser, or Thr; Xaa50 is Ile or Val; Xaa55 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa56 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa57 is Ile, Leu, Met or Val; Xaa58 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa59 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa60 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa61 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa62 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa63 is Ile or Val; Xaa66 is Ala, Gly, Pro, Ser, or Thr; Xaa69 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa72 is Arg, Gln, Glu, or Lys; Xaa74 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa76 is Ile or Val; Xaa78 is Ile, Leu, Met or Val; Xaa81 is Cys, Ile, Leu, Met, Phe, Trp, Tyr or Val; Xaa83 is Asn, Asp or Glu; Xaa84 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa85 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa86 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa87 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa89 is Ile or Val; Xaa91 is Arg or Lys; Xaa92 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa94 is Arg, Gln, Glu, or Lys; Xaa95 is Asn or Asp; Xaa97 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa99 is Arg, Gln, Glu, or Lys; Xaa100 is Ala, Gly, Pro, Ser, or Thr; Xaa104 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr.

The GDF subgroup pattern, SEQ. ID. No. 66, accommodates the homologies shared among members of the GDF subgroup identified to date including GDF-1, GDF-3, and GDF-9. The generic sequence, shown below, includes both the conserved amino acids (standard three letter code) as well as alternative amino acids (Xaa) present at the variable positions within the sequence and defined by the rules set forth in FIG. 3.

GDF Subgroup Pattern

Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Phe Xaa Xaa Xaa Xaa Trp Xaa
1               5                   10                  15
Xaa Trp Xaa Xaa Ala Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Xaa Gly
            20                  25                  30
Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
        35                  40                  45
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
    50                  55                  60
Pro Xaa Xaa Xaa Xaa Xaa Xaa Cys Val Pro Xaa Xaa Xaa Ser Pro Xaa
65                  70                  75                  80
Ser Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Tyr
                85                  90                  95
Glu Asp Met Xaa Xaa Xaa Xaa Cys Xaa Cys Xaa.
            100                 105

Each Xaa can be independently selected from a group of one or morespecified amino acids defined as follows, wherein: Xaa2 is Arg, Asn,Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa3 is Ala, Arg, Asn, Asp, Cys,Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr orVal; Xaa4 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa5 is Arg,Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa6 is Cys, Ile, Leu, Met,Phe, Trp, Tyr or Val; Xaa7 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly,His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa8 isIle, Leu, Met or Val; Xaa9 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser orThr; Xaa11 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa12 isArg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa13 is Ile, Leu, Met orVal; Xaa14 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu,Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa16 is Arg, Asn, Asp,Gln, Glu, His, Lys, Ser or Thr; Xaa17 is Arg, Asn, Asp, Gln, Glu, His,Lys, Ser or Thr; Xaa19 is Ile or Val; Xaa20 is Ile or Val; Xaa23 is Arg,Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa24 is Ala, Arg, Asn, Asp,Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp,Tyr or Val; Xaa25 is Phe, Trp or Tyr; Xaa26 is Ala, Arg, Asn, Asp, Cys,Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr orVal; Xaa27 is Ala, Gly, Pro, Ser, or Thr; Xaa28 is Arg, Asn, Asp, Gln,Glu, His, Lys, Ser or Thr; Xaa29 is Phe, Trp or Tyr; Xaa31 is Arg, Asn,Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa33 is Arg, Asn, Asp, Gln, Glu,His, Lys, Ser or Thr; Xaa35 is Ala, Gly, Pro, Ser, or Thr; Xaa36 is Ala,Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro,Ser, Thr, Trp, Tyr or Val; Xaa37 is Ala, Gly, Pro, Ser, or Thr; Xaa38 isAla, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe,Pro, Ser, Thr, Trp, Tyr or Val; Xaa39 is Arg, Asn, Asp, Gln, Glu, His,Lys, Ser or Thr; Xaa40 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His,Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, 
Trp, Tyr or Val; Xaa41 is Ala,Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro,Ser, Thr, Trp, Tyr or Val; Xaa42 is Ala, Arg, Asn, Asp, Cys, Glu, Gln,Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa43is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe,Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa44 is Ala, Arg, Asn,Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr,Trp, Tyr, Val or a peptide bond; Xaa45 is Ala, Arg, Asn, Asp, Cys, Glu,Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val ora peptide bond; Xaa46 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His,Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond;Xaa47 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys,Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa48 is Ala, Gly, Pro, Ser,or Thr; Xaa49 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu,Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa50 is Ala, Arg, Asn,Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr,Trp, Tyr or Val; Xaa51 is His, Phe, Trp or Tyr; Xaa52 is Ala, Gly, Pro,Ser, or Thr; Xaa53 is Cys, Ile, Leu, Met, Phe, Trp, Tyr or Val; Xaa54 isIle, Leu, Met or Val; Xaa55 is Arg, Gln, Glu, or Lys; Xaa56 is Ala, Arg,Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser,Thr, Trp, Tyr or Val; Xaa57 is Ile, Leu, Met or Val; Xaa58 is Ile, Leu,Met or Val; Xaa59 is His, Phe, Trp or Tyr; Xaa60 is Ala, Arg, Asn, Asp,Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp,Tyr or Val; Xaa61 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile,Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa62 is Ala, Arg,Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser,Thr, Trp, Tyr or Val; Xaa63 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly,His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptidebond; Xaa64 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, 
Leu,Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa66 isAla, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe,Pro, Ser, Thr, Trp, Tyr or Val; Xaa67 is Ala, Arg, Asn, Asp, Cys, Glu,Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val;Xaa68 is Ala, Gly, Pro, Ser, or Thr; Xaa69 is Arg, Asn, Asp, Gln, Glu,His, Lys, Ser or Thr; Xaa70 is Ala, Gly, Pro, Ser, or Thr; Xaa71 is Ala,Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro,Ser, Thr, Trp, Tyr or Val; Xaa75 is Ala, Arg, Asn, Asp, Cys, Glu, Gln,Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa76is Arg or Lys; Xaa77 is Cys, Ile, Leu, Met, Phe, Trp, Tyr or Val; Xaa80is Ile, Leu, Met or Val; Xaa82 is Ile, Leu, Met or Val; Xaa84 is Ala,Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro,Ser, Thr, Trp, Tyr or Val; Xaa85 is Ala, Arg, Asn, Asp, Cys, Glu, Gln,Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa86is Asp or Glu; Xaa87 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His,Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa88 is Arg,Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa89 is Ala, Arg, Asn, Asp,Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp,Tyr or Val; Xaa90 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr;Xaa91 is Ile or Val; Xaa92 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly,His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa93 isCys, Ile, Leu, Met, Phe, Trp, Tyr or Val; Xaa94 is Arg or Lys; Xaa95 isArg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa100 is Ile or Val;Xaa101 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys,Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa102 is Arg, Asn, Asp, Gln,Glu, His, Lys, Ser or Thr; Xaa103 is Arg, Gln, Glu, or Lys; Xaa105 isAla, Gly, Pro, Ser, or Thr; Xaa107 is Ala, Arg, Asn, Asp, Cys, Glu, Gln,Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val.

The Inhibin subgroup pattern, SEQ. ID. No. 67, accommodates the homologies shared among members of the Inhibin subgroup identified to date including Inhibin α, Inhibin βA and Inhibin βB. The generic sequence, shown below, includes both the conserved amino acids (standard three letter code) as well as alternative amino acids (Xaa) present at the variable positions within the sequence and defined by the rules set forth in FIG. 3.

Inhibin Subgroup Pattern

Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Phe Xaa Xaa Xaa Gly Trp Xaa
1               5                   10                  15
Xaa Trp Ile Xaa Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa Tyr Cys Xaa Gly
            20                  25                  30
Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
        35                  40                  45
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
    50                  55                  60
Xaa Xaa Xaa Xaa Xaa Cys Cys Xaa Xaa Xaa Pro Xaa Xaa Xaa Xaa Xaa
65                  70                  75                  80
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asp Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
                85                  90                  95
Xaa Xaa Xaa Asn Xaa Xaa Xaa Xaa Xaa Cys Xaa Cys Xaa.
            100                 105

Each Xaa can be independently selected from a group of one or more specified amino acids defined as follows, wherein: Xaa2 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa3 is Arg or Lys; Xaa4 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa5 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa6 is Cys, Ile, Leu, Met, Phe, Trp, Tyr or Val; Xaa7 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa8 is Ile or Val; Xaa9 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa11 is Arg, Gln, Glu, or Lys; Xaa12 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa13 is Ile, Leu, Met or Val; Xaa16 is Asn, Asp or Glu; Xaa17 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa20 is Ile or Val; Xaa21 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa23 is Ala, Gly, Pro, Ser, or Thr; Xaa24 is Ala, Gly, Pro, Ser, or Thr; Xaa25 is Phe, Trp or Tyr; Xaa26 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa27 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa28 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa31 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa33 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa35 is Ala, Gly, Pro, Ser, or Thr; Xaa36 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa37 is His, Phe, Trp or Tyr; Xaa38 is Ile, Leu, Met or Val; Xaa39 is Ala, Gly, Pro, Ser, or Thr; Xaa40 is Ala, Gly, Pro, Ser, or Thr; Xaa41 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa42 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa43 is Ala, Gly, Pro, Ser, or Thr; Xaa44 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa45 is Ala, Gly, Pro, Ser, or Thr; Xaa46 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa47 is Ala, Gly, Pro, Ser, or Thr; Xaa48 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa49 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa50 is Ala, Gly, Pro, Ser, or Thr; Xaa51 is Ala, Gly, Pro, Ser, or Thr; Xaa52 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa53 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa54 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa55 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa56 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa57 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa58 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa59 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa60 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa61 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa62 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa63 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa64 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa65 is Ala, Gly, Pro, Ser, or Thr; Xaa66 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa67 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa68 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa69 is Ala, Gly, Pro, Ser, or Thr; Xaa72 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa73 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa74 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa76 is Ala, Gly, Pro, Ser, or Thr; Xaa77 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa78 is Leu or Met; Xaa79 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa80 is Ala, Gly, Pro, Ser, or Thr; Xaa81 is Leu or Met; Xaa82 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa83 is Ile, Leu, Met or Val; Xaa84 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa85 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa86 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa87 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa89 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa90 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val or a peptide bond; Xaa91 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa92 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa93 is Cys, Ile, Leu, Met, Phe, Trp, Tyr or Val; Xaa94 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa95 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa96 is Arg, Gln, Glu, or Lys; Xaa97 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa98 is Ile or Val; Xaa99 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa101 is Leu or Met; Xaa102 is Ile, Leu, Met or Val; Xaa103 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val; Xaa104 is Gln or Glu; Xaa105 is Arg, Asn, Asp, Gln, Glu, His, Lys, Ser or Thr; Xaa107 is Ala or Gly; Xaa109 is Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val.
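By way of illustration only, the recurring residue groups in the foregoing definition can be expressed as sets and used to test whether a given residue is permitted at a given Xaa position. The following Python sketch is not part of the disclosed sequence; the group names (ANY, POLAR, SMALL), the ALLOWED table and the is_allowed function are hypothetical, and only a few positions are transcribed.

```python
# Illustrative sketch only: group names and function are assumptions made here.
ANY = frozenset(
    "Ala Arg Asn Asp Cys Glu Gln Gly His Ile Leu Lys Met Phe "
    "Pro Ser Thr Trp Tyr Val".split()
)
POLAR = frozenset("Arg Asn Asp Gln Glu His Lys Ser Thr".split())
SMALL = frozenset("Ala Gly Pro Ser Thr".split())

# A few positions transcribed from the definition above (not the full list).
ALLOWED = {
    2: ANY,
    3: frozenset({"Arg", "Lys"}),
    8: frozenset({"Ile", "Val"}),
    9: POLAR,
    23: SMALL,
    25: frozenset({"Phe", "Trp", "Tyr"}),
    104: frozenset({"Gln", "Glu"}),
    107: frozenset({"Ala", "Gly"}),
}


def is_allowed(position, residue):
    """Return True if `residue` is permitted at Xaa<position>.

    Raises KeyError for positions not transcribed into ALLOWED.
    """
    if position not in ALLOWED:
        raise KeyError(f"Xaa{position} is not in this illustrative table")
    return residue in ALLOWED[position]
```

For example, is_allowed(3, "Lys") is true, whereas is_allowed(3, "Ala") is false.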

(2) Biochemical, Structural and Functional Properties of Bone Morphogenic Proteins

In its mature, native form, natural-sourced osteogenic protein is a glycosylated dimer, typically having an apparent molecular weight of about 30-36 kDa as determined by SDS-PAGE. When reduced, the 30 kDa protein gives rise to two glycosylated peptide subunits having apparent molecular weights of about 16 kDa and 18 kDa. In the reduced state, the protein has no detectable osteogenic activity. The unglycosylated protein, which also has osteogenic activity, has an apparent molecular weight of about 27 kDa. When reduced, the 27 kDa protein gives rise to two unglycosylated polypeptide chains, having molecular weights of about 14 kDa to 16 kDa. Typically, the naturally occurring osteogenic proteins are translated as a precursor, having an N-terminal signal peptide sequence typically less than about 30 residues, followed by a “pro” domain that is cleaved to yield the mature C-terminal domain. The signal peptide is cleaved rapidly upon translation, at a cleavage site that can be predicted in a given sequence using the method of Von Heijne (1986) Nucleic Acids Research 14:4683-4691. Osteogenic proteins useful herein include any known naturally-occurring native proteins including allelic, phylogenetic counterpart and other variants thereof, whether naturally-occurring or biosynthetically produced (e.g., including “muteins” or “mutant proteins”), as well as new, osteogenically active members of the general morphogenic family of proteins.

In still another preferred embodiment, useful osteogenically active proteins have polypeptide chains with amino acid sequences comprising a sequence encoded by a nucleic acid that hybridizes, under low, medium or high stringency hybridization conditions, to DNA or RNA encoding reference osteogenic sequences, e.g., C-terminal sequences defining the conserved seven cysteine domains of OP-1, OP-2, BMP2, 4, 5, 6, 60A, GDF5, GDF6, GDF7 and the like. As used herein, high stringency hybridization conditions are defined as hybridization according to known techniques in 40% formamide, 5×SSPE, 5× Denhardt's Solution, and 0.1% SDS at 37° C. overnight, and washing in 0.1×SSPE, 0.1% SDS at 50° C. Standard stringency conditions are well characterized in commercially available, standard molecular cloning texts. See, for example, Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); and B. Perbal, A Practical Guide To Molecular Cloning (1984).

Other members of the TGF-β superfamily of related proteins having utility in the practice of the instant invention include poor refolder proteins among the list: TGF-β1, TGF-β2, TGF-β3, TGF-β4 and TGF-β5, various inhibins, activins, BMP-11, and MIS, to name a few. FIG. 5C lists the C-terminal residues defining the finger 2 subdomain of various known members of the TGF-β superfamily. Any one of the proteins on the list that is a poor refolder can be improved by the methods of the invention, as can other known or discoverable family members.

B. Production of Recombinant Proteins

As mentioned above, the constructs of the invention can be manufactured by using conventional recombinant DNA methodologies well known and thoroughly documented in the art, as well as by using well-known biosynthetic and chemosynthetic methodologies using routine peptide or nucleotide chemistries and automated peptide or nucleotide synthesizers. Such routine methodologies are described for example in the following publications, the teachings of which are incorporated by reference herein: Hilvert, 1 Chem. Biol. 201-3 (1994); Muir et al., 95 Proc. Natl. Acad. Sci. USA 6705-10 (1998); Wallace, 6 Curr. Opin. Biotechnol. 403-10 (1995); Miranda et al., 96 Proc. Natl. Acad. Sci. USA 1181-86 (1999); Liu et al., 91 Proc. Natl. Acad. Sci. USA 6584-88 (1994). Suitable for use in the present invention are naturally-occurring amino acids and nucleotides; non-naturally occurring amino acids and nucleotides; modified or unusual amino acids; modified bases; amino acid sequences that contain post-translationally modified amino acids and/or modified linkages, cross-links and end caps, non-peptidyl bonds, etc.; and, further including without limitation, those moieties disclosed in the World Intellectual Property Organization (WIPO) Handbook on Industrial Property Information and Documentation, Standard St. 25 (1998) including Tables 1 through 6 in Appendix 2, herein incorporated by reference. Equivalents of the foregoing will be appreciated by the skilled artisan relying only on routine experimentation together with the knowledge of the art.

For example, the contemplated DNA constructs may be manufactured by the assembly of synthetic nucleotide sequences and/or joining DNA restriction fragments to produce a synthetic DNA molecule. The DNA molecules then are ligated into an expression vehicle, for example an expression plasmid, and transfected into an appropriate host cell, for example E. coli. The contemplated protein construct encoded by the DNA molecule then is expressed, purified, refolded, tested in vitro for certain attributes, e.g., binding activity with a receptor having binding affinity for the template TGF-β superfamily member, and subsequently tested to assess whether the biosynthetic construct mimics other preferred attributes of the template superfamily member.

Alternatively, a library of synthetic DNA constructs can be prepared simultaneously, for example, by the assembly of synthetic nucleotide sequences that differ in nucleotide composition in a preselected region. For example, it is contemplated that during production of a construct based upon a specific TGF-β superfamily member, the artisan can choose appropriate finger and heel regions for such a superfamily member (for example from FIGS. 5-6). Once the appropriate finger and heel regions have been selected, the artisan then can produce synthetic DNA encoding these regions. For example, if a plurality of DNA molecules encoding different linker sequences are included in a ligation reaction containing DNA molecules encoding finger and heel sequences, by judicious choice of appropriate restriction sites and reaction conditions, the artisan may produce a library of DNA constructs wherein each of the DNA constructs encodes finger and heel regions but connected by different linker sequences. The resulting DNAs then are ligated into a suitable expression vehicle, i.e., a plasmid useful in the preparation of a phage display library, transfected into a host cell, and the polypeptides encoded by the synthetic DNAs expressed to generate a pool of candidate proteins. The pool of candidate proteins subsequently can be screened to identify specific proteins having binding affinity and/or selectivity for a pre-selected receptor.
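The combinatorial character of such a library can be illustrated with a short Python sketch. It is a bookkeeping aid under stated assumptions, not a description of the ligation chemistry: the function name build_library, the segment order (finger 1, linker, heel, linker, finger 2) and all segment labels are hypothetical placeholders rather than sequences from the specification.

```python
from itertools import product


def build_library(finger1, heel, finger2, linkers):
    """Enumerate constructs that share finger and heel segments but differ in linkers.

    The order finger1-linker-heel-linker-finger2 is an assumption made for
    illustration; each construct is returned as a tuple of segment labels.
    """
    return [
        (finger1, linker_a, heel, linker_b, finger2)
        for linker_a, linker_b in product(linkers, repeat=2)
    ]


# Hypothetical placeholder labels, not real sequences.
library = build_library("FINGER1", "HEEL", "FINGER2", ["L1", "L2", "L3"])
```

With three candidate linkers at each of two junctions, the sketch enumerates nine constructs, each of which would correspond to one member of the candidate pool to be screened.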

Screening can be performed by passing a solution comprising the candidate proteins through a chromatography column containing surface immobilized receptor. Then proteins with the desired binding specificity are eluted, for example by means of a salt gradient and/or a concentration gradient of the template TGF-β superfamily member. Nucleotide sequences encoding such proteins subsequently can be isolated and characterized. Once the appropriate nucleotide sequences have been identified, the lead proteins subsequently can be produced, either by conventional recombinant DNA or peptide synthesis methodologies, in quantities sufficient to test whether the particular construct mimics the activity of the template TGF-β superfamily member.

It is contemplated that, whichever approach is adopted to produce DNA molecules encoding constructs of the invention, the tertiary structure of the preferred proteins can subsequently be modulated in order to optimize binding and/or biological activity by, for example, a combination of nucleotide mutagenesis methodologies aided by the principles described herein and phage display methodologies. Accordingly, an artisan can produce and test simultaneously large numbers of such proteins.

(1) Gene Synthesis.

The processes for manipulating, amplifying, and recombining DNAs which encode amino acid sequences of interest generally are well known in the art, and therefore are not described in detail herein. Methods of identifying and isolating genes encoding members of the TGF-β superfamily and their cognate receptors also are well understood, and are described in the patent and other literature.

Briefly, the construction of DNAs encoding the biosynthetic constructs disclosed herein is performed using known techniques involving the use of various restriction enzymes which make sequence specific cuts in DNA to produce blunt ends or cohesive ends, DNA ligases, techniques enabling enzymatic addition of sticky ends to blunt-ended DNA, construction of synthetic DNAs by assembly of short or medium length oligonucleotides, cDNA synthesis techniques, polymerase chain reaction (PCR) techniques for amplifying appropriate nucleic acid sequences from libraries, and synthetic probes for isolating genes of members of the TGF-β superfamily and their cognate receptors. Various promoter sequences from bacteria, mammals, or insects, to name a few, and other regulatory DNA sequences used in achieving expression, and various types of host cells are also known and available. Conventional transfection techniques, and equally conventional techniques for cloning and subcloning DNA are useful in the practice of this invention and known to those skilled in the art. Various types of vectors may be used such as plasmids and viruses including animal viruses and bacteriophages. The vectors may exploit various marker genes which impart to a successfully transfected cell a detectable phenotypic property that can be used to identify which of a family of clones has successfully incorporated the recombinant DNA of the vector.

One method for obtaining DNA encoding the biosynthetic constructs disclosed herein is by assembly of synthetic oligonucleotides produced in a conventional, automated, oligonucleotide synthesizer followed by ligation with appropriate ligases. For example, overlapping, complementary DNA fragments may be synthesized using phosphoramidite chemistry, with end segments left unphosphorylated to prevent polymerization during ligation. One end of the synthetic DNA is left with a “sticky end” corresponding to the site of action of a particular restriction endonuclease, and the other end is left with an end corresponding to the site of action of another restriction endonuclease. The complementary DNA fragments are ligated together to produce a synthetic DNA construct.

Alternatively, nucleic acid strands encoding finger 1, finger 2 and heel regions may be isolated from libraries of nucleic acids, for example, by colony hybridization procedures such as those described in Sambrook et al. eds. (1989) “Molecular Cloning”, Cold Spring Harbor Laboratory Press, NY, and/or by PCR amplification methodologies, such as those disclosed in Innis et al. (1990) “PCR Protocols, A guide to methods and applications”, Academic Press. The nucleic acids encoding the finger and heel regions then are joined together to produce a synthetic DNA encoding the biosynthetic single-chain morphon construct of interest.

It is appreciated, however, that a library of DNA constructs encoding a plurality of morphons may be produced simultaneously by standard recombinant DNA methodologies, such as the ones described above. For example, the skilled artisan by the use of cassette mutagenesis or oligonucleotide directed mutagenesis may produce a series of DNA constructs, each of which contains different DNA sequences within a predefined location, e.g., within a DNA cassette encoding a linker sequence. The resulting library of DNA constructs subsequently may be expressed, for example, in a phage display library and any protein construct that binds to a specific receptor may be isolated by affinity purification, e.g., using a chromatographic column comprising surface immobilized receptor (see section V below). Once molecules that bind the preselected receptor have been isolated, their binding and agonist properties may be modulated using the empirical refinement techniques also discussed in section V, below.

Methods of mutagenesis of proteins and nucleic acids are well known and well described in the art. See, e.g., Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, 2d ed. (Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press). Useful methods include PCR overlap extension (see, e.g., PCR Primer, Dieffenbach and Dveksler, eds., Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 1995, pp. 603-611); cassette mutagenesis; and single-stranded mutagenesis following the method of Kunkel. It will be appreciated by the artisan that any suitable method of mutagenesis can be utilized and the mutagenesis method is not considered a material aspect of the invention. The nucleotide codons competent to encode amino acids, including arginine (Arg), glutamic acid (Glu) and aspartic acid (Asp) also are well known and described in the art. See, for example, Lehninger, Biochemistry (Worth Publishers, N.Y., N.Y.). Standard codons encoding arginine, glutamic acid and aspartic acid are: Arg: CGU, CGC, CGA, CGG, AGA, AGG; Glu: GAA, GAG; and Asp: GAU, GAC. Chimeric constructs of the invention can readily be constructed by aligning the nucleic acid sequences of protein regions, or domains to be switched, and identifying compatible splice sites and/or constructing suitable crossover sequences using PCR overlap extension.
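The codon assignments recited above lend themselves to a simple lookup. The following Python sketch is illustrative only; the names CODONS, residue_for and to_dna are hypothetical, and the table is limited to the three amino acids listed in the text.

```python
# RNA codons as recited in the text for Arg, Glu and Asp.
CODONS = {
    "Arg": ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "Glu": ["GAA", "GAG"],
    "Asp": ["GAU", "GAC"],
}


def residue_for(codon):
    """Return the amino acid encoded by an RNA codon, or None if not in the table."""
    for residue, codons in CODONS.items():
        if codon in codons:
            return residue
    return None


def to_dna(codon):
    """Convert an RNA codon to its DNA coding-strand equivalent (U -> T)."""
    return codon.replace("U", "T")
```

For instance, a mutagenic oligonucleotide introducing Asp could carry the DNA triplet returned by to_dna("GAU").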

The mutant forms of TGF-β family members of the present invention can be produced in bacteria using standard, well-known methods. Full-length mature forms or shorter sequences defining only the C-terminal seven cysteine domain can be provided to the host cell. It may be preferred to modify the N-terminal sequences of the mutant forms of the protein in order to optimize bacterial expression. For example, the preferred form of native OP-1 for bacterial expression is the sequence encoding the mature, active sequence (residues 293-431 of SEQ ID NO: 39) or a fragment thereof encoding the C-terminal seven cysteine domain (e.g., residues 330-431 of SEQ ID NO: 39). A methionine can be introduced at position 293, replacing the native serine residue, or it can precede this serine residue. Alternatively, a methionine can be introduced anywhere within the first thirty-six residues of the natural sequence (residues 293-329), up to the first cysteine of the TGF-β domain. The DNA sequence further can be modified at its N-terminus to improve purification, for example, by adding a “hexa-his” tail to assist purification on an IMAC column; or by using a FB leader sequence, which facilitates purification on an IgG column. These and other methods are well described and well known in the art. Other bacterial species and/or proteins may require or benefit from analogous modifications to optimize the yield of the mutant BMP obtained therefrom. Such modifications are well within the level of ordinary skill in the art and are not considered material aspects of the invention.
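Residue ranges such as those recited above are 1-based and inclusive, which differs from the 0-based, end-exclusive slicing used by most programming languages. The following Python sketch shows one way to extract such ranges. The helper name residues is hypothetical, and the 431-residue string is a placeholder of "X" characters, not SEQ ID NO: 39.

```python
def residues(sequence, start, end):
    """Return residues start..end of `sequence`, numbered from 1, both ends inclusive."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("residue range outside sequence")
    return sequence[start - 1:end]


# Placeholder standing in for a 431-residue precursor; not a real sequence.
precursor = "X" * 431

mature = residues(precursor, 293, 431)           # mature form
cysteine_domain = residues(precursor, 330, 431)  # C-terminal seven cysteine domain
```

Under this numbering, residues 293-431 span 139 positions and residues 330-431 span 102 positions.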

The synthetic nucleic acids preferably are inserted into a vector suitable for overexpression in the host cell of choice. Any expression vector can be used, so long as it is capable of directing the expression of a heterologous protein such as a BMP in the host cell of choice. Useful vectors include plasmids, phagemids, mini chromosomes and YACs, to name a few. Other vector systems are well known and characterized in the art. The vector typically includes a replicon, one or more selectable marker gene sequences, and means for maintaining a high copy number of the vector in the host cell. Well known selectable marker genes include genes conferring resistance to antibiotics like ampicillin, tetracycline and the like, as well as resistance to heavy metals. Useful selectable marker genes for use in yeast cells include the URA3, LEU2, HIS3 or TRP1 gene for use with an auxotrophic yeast mutant host. In addition, the vector also includes a suitable promoter sequence for expressing the gene of interest and which may or may not be inducible, as desired, as well as useful transcription and translation initiation sites, terminators, and other sequences that can maximize transcription and translation of the gene of interest. Well characterized promoters particularly useful in bacterial cells include the lac, tac, trp, and tpp promoters, to name a few. Promoters useful in yeast include the ADHI, ADHII, or PHO5 promoter, for example.

Suitable host cells include microbial cells such as Bacillus subtilis (B. subtilis), species of Pseudomonas, Escherichia coli (E. coli), and yeast cells, e.g., Saccharomyces cerevisiae. Other host cells, for example mammalian cells such as CHO, can be used.

The gene of interest can be transformed into the host cell of choice using standard microbiology techniques (electroporation or calcium chloride, for example) and the cells induced to grow under suitable conditions. Cell culturing media are well described in the art, including numerous well known texts, including Sambrook, et al. Useful media include LB (Luria's Broth) and Dulbecco's DMEM. The overexpressed protein can be collected from insoluble, refractile inclusion bodies by standard techniques, including cell lysis or mechanical disruption of the cell (French press, SLM Instruments, Inc., for example) followed by centrifugation and resolubilization (see below).

For example, if the gene is to be expressed in E. coli, it is cloned into an appropriate expression vector. This can be accomplished by positioning the engineered gene downstream of a promoter sequence such as Trp or Tac, and/or a gene coding for a leader peptide such as fragment B of protein A (FB). During expression, the resulting fusion proteins accumulate in refractile bodies in the cytoplasm of the cells, and may be harvested after disruption of the cells by French press or sonication. The isolated refractile bodies then are solubilized, and the expressed proteins folded and the leader sequence cleaved, if necessary, by methods already established with many other recombinant proteins.

Expression of the engineered genes in eukaryotic cells requires cells and cell lines that are easy to transfect, are capable of stably maintaining foreign DNA with an unrearranged sequence, and which have the necessary cellular components for efficient transcription, translation, post-translational modification, and secretion of the protein. In addition, a suitable vector carrying the gene of interest also is necessary. DNA vector design for transfection into mammalian cells should include appropriate sequences to promote expression of the gene of interest as described herein, including appropriate transcription initiation, termination, and enhancer sequences, as well as sequences that enhance translation efficiency, such as the Kozak consensus sequence. Preferred DNA vectors also include a marker gene and means for amplifying the copy number of the gene of interest. A detailed review of the state of the art of the production of foreign proteins in mammalian cells, including useful cells, protein expression-promoting sequences, marker genes, and gene amplification methods, is disclosed in Bendig (1988) Genetic Engineering 7:91-127.

The best characterized transcription promoters useful for expressing a foreign gene in a particular mammalian cell are the SV40 early promoter, the adenovirus promoter (AdMLP), the mouse metallothionein-I promoter (mMT-I), the Rous sarcoma virus (RSV) long terminal repeat (LTR), the mouse mammary tumor virus long terminal repeat (MMTV-LTR), and the human cytomegalovirus major immediate-early promoter (hCMV). The DNA sequences for all of these promoters are known in the art and are available commercially.

The use of a selectable DHFR gene in a dhfr⁻ cell line is a well characterized method useful in the amplification of genes in mammalian cell systems. Briefly, the DHFR gene is provided on the vector carrying the gene of interest, and addition of increasing concentrations of the cytotoxic drug methotrexate, which is metabolized by DHFR, leads to amplification of the DHFR gene copy number, as well as that of the associated gene of interest. DHFR as a selectable, amplifiable marker gene in transfected Chinese hamster ovary cell lines (CHO cells) is particularly well characterized in the art. Other useful amplifiable marker genes include the adenosine deaminase (ADA) and glutamine synthetase (GS) genes.

The choice of cells/cell lines is also important and depends on the needs of the experimenter. COS cells provide high levels of transient gene expression, providing a useful means for rapidly screening the biosynthetic constructs of the invention. COS cells typically are transfected with a simian virus 40 (SV40) vector carrying the gene of interest. The transfected COS cells eventually die, thus preventing the long term production of the desired protein product. However, transient expression does not require the time consuming process required for the development of a stable cell line, and thus provides a useful technique for testing preliminary constructs for binding activity.

The various cells, cell lines and DNA sequences that can be used for mammalian cell expression of the single-chain constructs of the invention are well characterized in the art and are readily available. Other promoters, selectable markers, gene amplification methods and cells also may be used to express the proteins of this invention. Particular details of the transfection, expression, and purification of recombinant proteins are well documented in the art and are understood by those having ordinary skill in the art. Further details on the various technical aspects of each of the steps used in recombinant production of foreign genes in mammalian cell expression systems can be found in a number of texts and laboratory manuals in the art, such as, for example, F. M. Ausubel et al., ed., Current Protocols in Molecular Biology, John Wiley & Sons, New York, (1989).

C. Refolding Considerations

The protein, once isolated from inclusion bodies, is solubilized using a denaturant or chaotropic agent such as guanidine HCl or urea, preferably in the range of about 4-9 M and at an elevated temperature (e.g., 25-37° C.) and/or basic pH (8-10). Alternatively, the proteins can be solubilized by acidification, e.g., with acetic acid or trifluoroacetic acid, generally at a pH in the range of 1-4. Preferably, a reducing agent such as β-mercaptoethanol or dithiothreitol (DTT) is used in conjunction with the solubilizing agent. The solubilized heterologous protein can be purified further from solubilizing chaotropes by dialysis and/or by known chromatographic methods such as size exclusion chromatography, ion exchange chromatography, or reverse phase high performance liquid chromatography (RP-HPLC), for example.

The solubilized protein can be refolded as follows. The dissolved protein is diluted in a refolding medium, typically a Tris-buffered medium having a pH in the range of about pH 5.0-10.0, preferably in the range of about pH 6-9, and one which includes a detergent and/or chaotropic agent. Useful commercially available detergents can be ionic, nonionic or zwitterionic, such as NP40 (Nonidet 40), CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate), digitonin, deoxycholate, or N-octyl glucoside. Useful chaotropic agents include guanidine, urea, or arginine. Preferably the detergent or chaotropic agent is present at a concentration in the range of about 0.1-10 M, preferably in the range of about 0.5-4 M. When CHAPS is the detergent, it preferably comprises about 0.5-5% of the solution, more preferably about 1-3% of the solution. Preferably the solution also includes a suitable redox system such as the oxidized and reduced forms of glutathione, DTT, β-mercaptoethanol, β-mercaptomethanol, cysteine or cystamine, to name a few. Preferably, the redox systems are present at ratios of reductant to oxidant in the range of about 1:1 to about 5:1. When the glutathione redox system is used, the ratio of reduced glutathione to oxidized glutathione preferably is in the range of about 0.5 to 5; more preferably 1 to 1; and most preferably 2 to 1 of reduced form to oxidized form. Preferably the buffer also contains a salt, typically NaCl, present in the range of about 0.25-2.5 M, preferably in the range of about 0.5-1.5 M, most preferably in the range of about 1 M. One skilled in the art will recognize that the above conditions and media may be varied using no more than ordinary experimentation. Such variations and modifications are within the scope of the present invention.
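The ranges recited above can be collected into a simple table and used to flag a proposed refolding medium that falls outside them. The following Python sketch is illustrative only: the names BROAD_RANGES and out_of_range are hypothetical, the bounds are treated as nominal hard limits although the text qualifies each with "about", and only the pH, CHAPS, general redox ratio and NaCl ranges are included.

```python
# Nominal broad ranges transcribed from the text; "about" is ignored here.
BROAD_RANGES = {
    "ph": (5.0, 10.0),            # refolding medium pH
    "chaps_percent": (0.5, 5.0),  # CHAPS, percent of solution
    "redox_ratio": (1.0, 5.0),    # reductant : oxidant
    "nacl_molar": (0.25, 2.5),    # NaCl, M
}


def out_of_range(conditions, ranges=BROAD_RANGES):
    """Return the sorted names of conditions lying outside their nominal range.

    Conditions without an entry in `ranges` are ignored.
    """
    flagged = []
    for name, value in conditions.items():
        if name in ranges:
            low, high = ranges[name]
            if not (low <= value <= high):
                flagged.append(name)
    return sorted(flagged)


proposed = {"ph": 8.0, "chaps_percent": 2.0, "redox_ratio": 2.0, "nacl_molar": 1.0}
flags = out_of_range(proposed)
```

A medium at pH 8 with 2% CHAPS, a 2:1 redox ratio and 1 M NaCl yields an empty list of flags in this sketch.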

Preferably the protein concentration for a given refolding reaction is in the range of about 0.001-1.0 mg/ml, more preferably it is in the range of about 0.05-0.25 mg/ml, most preferably in the range of about 0.075-0.125 mg/ml. As will be appreciated by the skilled artisan, higher concentrations tend to produce more aggregates. Where heterodimers are to be produced (for example an OP1/BMP2 or BMP2/BMP6 heterodimer) preferably the individual proteins are provided to the refolding buffer in equal amounts.
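As a worked illustration of the concentrations recited above, the following Python sketch computes the final refolding volume needed to bring a given mass of protein to a target concentration, and splits a heterodimer reaction into two equal inputs. The function names are hypothetical, the default target of 0.1 mg/ml is simply a value within the most preferred range, and "equal amounts" is interpreted here as equal mass, which is an assumption.

```python
def refolding_volume_ml(protein_mg, target_mg_per_ml=0.1):
    """Final reaction volume (ml) giving `target_mg_per_ml` for `protein_mg` of protein."""
    if protein_mg < 0 or target_mg_per_ml <= 0:
        raise ValueError("mass must be non-negative and concentration positive")
    return protein_mg / target_mg_per_ml


def heterodimer_inputs_mg(total_mg):
    """Split a total protein mass equally between two subunits (equal mass assumed)."""
    half = total_mg / 2.0
    return half, half
```

For example, 5 mg of solubilized protein refolded at 0.1 mg/ml corresponds to a final volume of about 50 ml.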

Typically, the refolding reaction takes place at a temperature range from about 4° C. to about 25° C. More preferably, the refolding reaction is performed at 4° C., and allowed to go to completion. Refolding typically is complete in about one to seven days, generally within 16-72 hours or 24-48 hours, depending on the protein. As will be appreciated by the skilled artisan, rates of refolding can vary by protein, and longer and shorter refolding times are contemplated and within the scope of the present invention. As used herein, a “good refolder” protein is one where at least 20% of the protein is present in dimeric form following a folding reaction when compared to the total protein in the refolding reaction, as measured by any of the refolding assays described herein and without requiring further purification. Native BMPs that are considered in the art to be “good refolder” proteins include BMP2, CDMP1, CDMP2 and CDMP3. BMP-3 also refolds reasonably well. In contrast, a “poor refolder” protein yields less than 1% of properly-folded protein.
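The two thresholds defined above can be applied mechanically to a measured dimer yield. The following Python sketch is illustrative: the function name classify_refolder is hypothetical, the dimer fraction is taken as a proxy for properly-folded protein, and the label "unclassified" for yields between 1% and 20% is an assumption, since the text does not name that intermediate case.

```python
def classify_refolder(dimer_amount, total_amount):
    """Classify a refolding result from the fraction of protein present as dimer.

    Both amounts must be in the same units (for example mg).
    """
    if total_amount <= 0 or dimer_amount < 0 or dimer_amount > total_amount:
        raise ValueError("amounts must satisfy 0 <= dimer <= total and total > 0")
    fraction = dimer_amount / total_amount
    if fraction >= 0.20:
        return "good refolder"
    if fraction < 0.01:
        return "poor refolder"
    return "unclassified"  # between the two thresholds; label is an assumption
```

For instance, a reaction in which 2.5 mg of 10 mg total protein is recovered as dimer would be classified as a good refolder under this sketch.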

Properly refolded dimeric proteins readily can be assessed using any of a number of well known and well characterized assays. In particular, any one or more of three assays, all well known and well described in the art, and further described below can be used to advantage. Useful refolding assays include one or more of the following. First, the presence of dimers can be detected visually either by standard SDS-PAGE in the absence of a reducing agent such as DTT or by HPLC (e.g., C18 reverse phase HPLC). BMP dimeric proteins have an apparent molecular weight in the range of about 28-36 kDa, as compared to monomeric subunits, which have an apparent molecular weight of about 14-18 kDa. The dimeric protein can readily be visualized on an electrophoresis gel by comparison to commercially available molecular weight standards. The dimeric protein also elutes from a C18 RP HPLC (45-50% acetonitrile: 0.1% TFA) at about 19 minutes (mammalian produced hOP-1 elutes at 18.95 minutes).
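The apparent molecular weight ranges recited above permit a rough assignment of a band on a non-reducing SDS-PAGE gel. The following Python sketch is illustrative only; the function name classify_band is hypothetical, and the ranges are applied as hard limits although the text qualifies them with "about".

```python
DIMER_KDA = (28.0, 36.0)    # apparent molecular weight range for BMP dimers
MONOMER_KDA = (14.0, 18.0)  # apparent molecular weight range for monomeric subunits


def classify_band(apparent_kda):
    """Assign a gel band to 'dimer', 'monomer' or 'other' by apparent molecular weight."""
    if DIMER_KDA[0] <= apparent_kda <= DIMER_KDA[1]:
        return "dimer"
    if MONOMER_KDA[0] <= apparent_kda <= MONOMER_KDA[1]:
        return "monomer"
    return "other"
```

A band estimated at 30 kDa against molecular weight standards would be assigned as dimer, and a band at 16 kDa as monomer.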

A second assay evaluates the presence of dimer by its ability to bind to hydroxyapatite. Properly-folded dimer binds a hydroxyapatite column well in the presence of 0.1-0.2M NaCl (dimer elutes at 0.25 M NaCl) as compared to monomer, which does not bind substantially at those concentrations (monomer elutes at 0.1M NaCl).

A third assay evaluates the presence of dimer by the protein's resistance to trypsin or pepsin digestion. The folded dimeric species is substantially resistant to both enzymes, particularly trypsin, which cleaves only a small portion of the N-terminus of the mature protein, leaving a biologically active dimeric species only slightly smaller in size than the untreated dimer. By contrast, the monomer is substantially degraded. In the assay, the protein is subjected to an enzyme digest using standard conditions, e.g., digestion in a standard buffer such as 50 mM Tris buffer, pH 8, containing 4 M urea, 100 mM NaCl, 0.3% Tween-80 and 20 mM methylamine. Digestion is allowed to occur at 37° C. for on the order of 16 hours, and the product is visualized by any suitable means, preferably SDS-PAGE.

The biological activity of the refolded TGF-β family protein readily can be assessed by any of a number of means. A BMP's ability to induce endochondral bone formation can be evaluated using the well characterized rat subcutaneous bone assay, described in the art and in detail below. In the assay bone formation is measured by histology, as well as by alkaline phosphatase and/or osteocalcin production. In addition, osteogenic proteins having high specific bone forming activity, such as OP-1, BMP-2, BMP-4, BMP5 and BMP6, also induce alkaline phosphatase activity in an in vitro rat osteoblast or osteosarcoma cell-based assay. Such assays are well described in the art and are detailed herein below. See, for example, Sabokdar et al. (1994) Bone and Mineral 27:57-67; Knutsen et al. (1993) Biochem. Biophys. Res. Commun. 194:1352-1358; and Maliakal et al. (1994) Growth Factors 1:227-234. By contrast, osteogenic proteins having low specific bone forming activity, such as CDMP-1 and CDMP-2, for example, do not induce similar levels of alkaline phosphatase activity in the cell based osteoblast assay. The assay thus provides a ready method for evaluating the biological activity of mutants of BMPs. For example, CDMP1, CDMP2 and CDMP3 all are competent to induce bone formation, although with a lower specific activity than BMP2, BMP4, BMP5, BMP6 or OP-1. Conversely, BMP2, BMP4, BMP5, BMP6 and OP-1 all can induce articular cartilage formation, albeit with a lower specific activity than CDMP1, CDMP2 or CDMP3. Accordingly, a CDMP mutant competent to induce alkaline phosphatase activity in the cell-based assay of Example 5 is expected to demonstrate a higher specific bone forming activity in the rat animal bioassay. Similarly, an OP-1 mutant containing a substitution present in a corresponding position of a CDMP1, CDMP2 or CDMP3 protein, and competent to induce bone in the rat assay but not to induce alkaline phosphatase activity in the cell based assay, is expected to have a higher specific articular cartilage inducing activity in an in vivo articular cartilage assay. As described herein below, a suitable in vitro assay for CDMP activity utilizes mouse embryonic osteoprogenitor or carcinoma cells, such as ATDC5 cells. See Example 6, below.

TGF-β activity can be readily evaluated by the protein's ability to inhibit epithelial cell growth. A useful, well characterized in vitro assay utilizes mink lung cells or melanoma cells. See Example 7. Other assays for other members of the TGF-β superfamily are well described in the literature and can be performed without undue experimentation.

D. Formulation and Bioactivity

The resulting chimeric proteins can be provided to an individual as part of a therapy to enhance, inhibit, or otherwise modulate in vivo events, such as but not limited to, the binding interaction between a TGF-β superfamily member and one or more of its cognate receptors. The constructs may be formulated in a pharmaceutical composition, as described below, and may be administered in morphogenic effective amounts by any suitable means, preferably directly or systemically, e.g., parenterally or orally. Resulting DNA constructs encoding preferred chimeric proteins can also be administered directly to a recipient for gene therapeutic purposes; such DNAs can be administered with or without carrier components, or with or without matrix components. Alternatively, cells transfected with such DNA constructs can be implanted in a recipient. Such materials and methods are well-known in the art.

Where any of the constructs disclosed here are to be provided directly (e.g., locally, as by injection, to a desired tissue site), or parenterally, such as by intravenous, subcutaneous, intramuscular, intraorbital, ophthalmic, intraventricular, intracranial, intracapsular, intraspinal, intracisternal, intraperitoneal, buccal, rectal, vaginal, intranasal or by aerosol administration, the therapeutic composition preferably comprises part of an aqueous solution. The solution preferably is physiologically acceptable so that in addition to delivery of the desired construct to the patient, the solution does not otherwise adversely affect the patient's electrolyte and volume balance. The aqueous medium for the therapeutic molecule thus may comprise, for example, normal physiological saline (0.9% NaCl, 0.15M), pH 7-7.4 or other pharmaceutically acceptable salts thereof.
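The correspondence between 0.9% NaCl and about 0.15M recited above can be verified by a simple conversion. The following Python sketch is illustrative only; it assumes that the percentage is weight per volume (grams per 100 mL) and uses a molar mass for NaCl of about 58.44 g/mol, neither of which is stated in this specification.

```python
NACL_MOLAR_MASS_G_PER_MOL = 58.44  # assumption: standard molar mass of NaCl


def percent_wv_to_molar(percent_wv: float, molar_mass: float) -> float:
    """Convert a weight/volume percentage (g per 100 mL) to mol/L."""
    if percent_wv < 0 or molar_mass <= 0:
        raise ValueError("percent_wv must be >= 0 and molar_mass must be > 0")
    grams_per_liter = percent_wv * 10.0  # g per 100 mL -> g per L
    return grams_per_liter / molar_mass


if __name__ == "__main__":
    molarity = percent_wv_to_molar(0.9, NACL_MOLAR_MASS_G_PER_MOL)
    print(f"0.9% NaCl is approximately {molarity:.3f} M")
```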

Useful solutions for oral or parenteral administration may be prepared by any of the methods well known in the pharmaceutical art, described, for example, in Remington's Pharmaceutical Sciences, (Gennaro, A., ed.), Mack Pub., 1990. Formulations may include, for example, polyalkylene glycols such as polyethylene glycol, oils of vegetable origin, hydrogenated naphthalenes, and the like. Formulations for direct administration, in particular, may include glycerol and other compositions of high viscosity. Biocompatible, preferably bioresorbable polymers, including, for example, hyaluronic acid, collagen, tricalcium phosphate, polybutyrate, polylactide, polyglycolide and lactide/glycolide copolymers, may be useful excipients to control the release of the morphogen in vivo.

Other potentially useful parenteral delivery systems for these therapeutic molecules include ethylene-vinyl acetate copolymer particles, osmotic pumps, implantable infusion systems, and liposomes. Formulations for inhalation administration may contain as excipients, for example, lactose, or may be aqueous solutions containing, for example, polyoxyethylene-9-lauryl ether, glycocholate and deoxycholate, or oily solutions for administration in the form of nasal drops, or as a gel to be applied intranasally.

Finally, therapeutic molecules may be administered alone or in combination with other molecules known to effect tissue morphogenesis, i.e., molecules capable of tissue repair and regeneration and/or inhibiting inflammation. Examples of useful cofactors for stimulating bone tissue growth in osteoporotic individuals, for example, include but are not limited to, vitamin D₃, calcitonin, prostaglandins, parathyroid hormone, dexamethasone, estrogen and IGF-I or IGF-II. Useful cofactors for nerve tissue repair and regeneration may include nerve growth factors. Other useful cofactors include symptom-alleviating cofactors, including antiseptics, antibiotics, antiviral and antifungal agents and analgesics and anesthetics.

Therapeutic molecules further can be formulated into pharmaceutical compositions by admixture with pharmaceutically acceptable nontoxic excipients and carriers. As noted above, such compositions may be prepared for parenteral administration, particularly in the form of liquid solutions or suspensions; for oral administration, particularly in the form of tablets or capsules; or intranasally, particularly in the form of powders, nasal drops or aerosols. Where adhesion to a tissue surface is desired the composition may include the biosynthetic construct dispersed in a fibrinogen-thrombin composition or other bioadhesive such as is disclosed, for example in PCT US91/09275, the disclosure of which is incorporated herein by reference. The composition then may be painted, sprayed or otherwise applied to the desired tissue surface. The compositions can be formulated for parenteral or oral administration to humans or other mammals in therapeutically effective amounts, e.g., amounts which provide appropriate concentrations of the morphogen to target tissue for a time sufficient to induce the desired effect.

Where the therapeutic molecule comprises part of a tissue or organ preservation solution, any commercially available preservation solution may be used to advantage. For example, useful solutions known in the art include Collins solution, Wisconsin solution, Belzer solution, Eurocollins solution and lactated Ringer's solution. A detailed description of preservation solutions and useful components may be found, for example, in U.S. Pat. No. 5,002,965, the disclosure of which is incorporated herein by reference.

It is contemplated that some of the protein constructs, for example those based upon members of the Vg/dpp subgroup, will also exhibit high levels of activity in vivo when combined with a matrix. See for example, U.S. Pat. No. 5,266,683 the disclosure of which is incorporated by reference herein. The currently preferred matrices are xenogenic, allogenic or autogenic in nature. It is contemplated, however, that synthetic materials comprising polylactic acid, polyglycolic acid, polybutyric acid, derivatives and copolymers thereof can also be used to generate suitable matrices. Preferred synthetic and naturally derived matrix materials, their preparation, methods for formulating them with the morphogenic proteins of the invention, and methods of administration are well known in the art and so are not discussed in detail herein. See for example, U.S. Pat. No. 5,266,683, the disclosure of which is herein incorporated by reference. It is further contemplated that binding to, adherence to or association with a matrix or the metal surface of a prosthetic device is an attribute that can be altered using the materials and methods disclosed herein. For example, devices comprising a matrix and an osteoactive construct of the present invention having enhanced matrix-adherent properties can be used as a slow-release device. The skilled artisan will appreciate the variation and manipulations now possible in light of the teachings herein.

As will be appreciated by those skilled in the art, the concentration of the compounds described in a therapeutic composition will vary depending upon a number of factors, including the morphogenic effective amount to be administered, the chemical characteristics (e.g., hydrophobicity) of the compounds employed, and the route of administration. The preferred dosage of drug to be administered also is likely to depend on such variables as the type and extent of a disease, tissue loss or defect, the overall health status of the particular patient, the relative biological efficacy of the compound selected, the formulation of the compound, the presence and types of excipients in the formulation, and the route of administration. In general terms, the therapeutic molecules of this invention may be provided to an individual where typical doses range from about 10 ng/kg to about 1 g/kg of body weight per day; with a preferred dose range being from about 0.1 mg/kg to 100 mg/kg of body weight.
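By way of non-limiting illustration, the per-kilogram dose ranges recited above can be converted into absolute amounts for a given body weight. The Python sketch below is illustrative only and is not dosing guidance; the names used and the 70 kg example body weight are assumptions introduced here.

```python
NG_PER_MG = 1_000_000
MG_PER_G = 1_000

# Ranges from the text, expressed in mg per kg of body weight.
TYPICAL_MG_PER_KG = (10 / NG_PER_MG, 1 * MG_PER_G)  # about 10 ng/kg to about 1 g/kg
PREFERRED_MG_PER_KG = (0.1, 100.0)                   # about 0.1 mg/kg to 100 mg/kg


def dose_window_mg(body_weight_kg: float, per_kg_range: tuple) -> tuple:
    """Scale a (low, high) mg/kg range to absolute mg for one body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body_weight_kg must be positive")
    low, high = per_kg_range
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    weight_kg = 70.0  # hypothetical example body weight
    print("typical (mg):  ", dose_window_mg(weight_kg, TYPICAL_MG_PER_KG))
    print("preferred (mg):", dose_window_mg(weight_kg, PREFERRED_MG_PER_KG))
```

The actual amount selected within any such window will depend on the variables enumerated above.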

II. Specific Modified Protein Constructs

Generally, the present invention relates to four types of modified TGF-β family protein constructs: (1) TGF-β family proteins which are truncated at the N-terminal region, (2) “latent” proteins that can be activated upon cleavage, including, but not limited to, release of an N-terminal sequence (e.g., by acid cleavage or protease treatment), (3) fusion proteins with specific binding capabilities and (4) heterodimers consisting of naturally-occurring or modified subunits of TGF-β family members. Particular species of these morphogen constructs are described in detail below. The species exemplified below generally relate to modified morphogen or osteogenic protein constructs, but the skilled practitioner will appreciate that these constructs are representative of similar constructs that can be generated with other members of the TGF-β superfamily.

According to the present invention, the attributes of native BMPs or other members of the TGF-β superfamily of proteins, including heterodimers and homodimers thereof, are altered by modifying the N-terminus of a native protein to alter one or more biological properties of a BMP or TGF-β superfamily member. As a result of this discovery, it is possible to design TGF-β superfamily proteins that (1) are expressed recombinantly in prokaryotic or eukaryotic cells or synthesized using polypeptide synthesizers; (2) have altered folding attributes; (3) have altered solubility under neutral pHs, including but not limited to physiologically compatible conditions; (4) have altered isoelectric points; (5) have altered stability; (6) have an altered tissue or receptor specificity; (7) have a re-designed, altered biological activity; and/or (8) have altered binding or adherence properties to solid surfaces, such as but not limited to, biocompatible matrices or metals. Thus, the present invention can provide mechanisms for designing quick-release, slow-release and/or timed-release formulations containing a preferred protein construct. Other advantages and features will be evident from the teachings below. Moreover, making use of the discoveries disclosed herein, modified proteins having altered surface-binding/surface-adherent properties can be designed and selected. Surfaces of particular significance include, but are not limited to, solid surfaces which can be naturally-occurring such as bone; or porous particulate surfaces such as collagen or other biocompatible matrices; or the fabricated surfaces of prosthetic implants, including metals. As contemplated herein, virtually any surface can be assayed for differential binding of constructs. Thus, the present invention embraces a diversity of functional molecules having alterations in their surface-binding/surface-adherent properties, thereby rendering such constructs useful for altered in vivo applications, including slow-release, fast-release and/or timed-release formulations.

The skilled artisan will appreciate that mixing-and-matching any one or more of the above-recited attributes provides specific opportunities to manipulate the uses of customized proteins (and DNAs encoding the same). For example, the attribute of altered stability can be exploited to manipulate the turnover of a protein in vivo. Moreover, in the case of proteins also having attributes such as altered re-folding and/or function, there is likely an interconnection between folding, function and stability. See, for example, Lipscomb et al., 7 Protein Sci. 765-73 (1998); and Nikolova et al., 95 Proc. Natl. Acad. Sci. USA 14675-80 (1998). For purposes of the present invention, stability alterations can be routinely monitored using well-known techniques of circular dichroism or other indices of stability as a function of denaturant concentration or temperature. One can also use routine scanning calorimetry. Similarly, there is likely an interconnection between any of the foregoing attributes and the attribute of solubility. In the case of solubility, it is possible to manipulate this attribute so that a protein construct is either more or less soluble under physiologically-compatible conditions and it consequently diffuses readily or remains localized, respectively, when administered in vivo.

In addition to the aforementioned uses of protein constructs with altered attributes, those with altered stability can also be used to practical advantage for shelf-life, storage and/or shipping considerations. Furthermore, on a related matter, altered stability can also directly affect dosage considerations thereby, for example, reducing the cost of treatment.

A particularly significant class of constructs are those having altered binding to solubilized carriers or excipients. By way of non-limiting example, an altered BMP having enhanced binding to a solubilized carrier such as hyaluronic acid permits the skilled artisan to administer an injectable formulation at a defect site without loss or dilution of the BMP by either diffusion or body fluids. Thus localization is maximized. The skilled artisan will appreciate the variations made possible by the instant teachings. Similarly, another class of constructs having altered binding to body/tissue components can be exploited. By way of non-limiting example, an altered BMP having diminished binding to an in-situ inhibitor can be used to enhance repair of certain tissues in vivo. It is well known in the art, for example, that cartilage tissue is associated with certain proteins found in body fluids and/or within cartilage per se that can inhibit the activity of native BMPs. Chimeric constructs with altered binding properties, however, can overcome the effects of these in-situ inhibitors thereby enhancing repair, etc. The skilled artisan will appreciate the variations made possible by the instant teachings.

A. Truncation

There are different forms of OP-1, such as 23k, 17k, and variable amounts of 15k, whereby the typical OP-1 preparation contains all these species. N-terminal sequencing of purified mature OP-1 has revealed heterogeneity showing that the N-terminus can be more or less truncated. Experiments with the species retrieved by elution from RP-HPLC and by trypsin cleavage show that ROS activity is greatest among the 15k species. For example, truncated mutant H2469 has relatively high activity by comparison with the CHO-derived OP-1 standard. Whereas initial maturation occurs in pro-OP-1 at the RXXR site resulting in the 17k species, a secondary maturation by a different protease produces the most active 15k species. Trypsin cleavage can mimic this secondary activation.

Trypsin treatment of mammalian OP-1 or E. coli refolded OP-1 results in increased ROS activity. Removal of the N-terminus of the constructs described herein (e.g., hexa-his, collagen binding site, and BMP-2 N-terminus) also resulted in increased activity in a ROS assay. Truncation of OP-1 can increase solubility of the morphogen, which can affect ROS activity. Thus, constructs can be created having specific cleavage activity, that is, they are selective for the type of cleavage and the timing of the cleavage. One skilled in the art will appreciate that cleavage activity may differ based on the system used (mammalian or prokaryote). For example, a mammalian system may require that the morphogen construct include a pro region, which in the context of the construct, could disrupt folding and consequently will result (in the mammalian system), in complete intracellular degradation with no protein at the end. It may also be desirable to produce other constructs that include the pro-protein form. In such constructs, the pro-domain can be considered as another N-terminal element which can be cleaved to obtain increased activity. The skilled practitioner will appreciate that the uncleaved pro-protein can be utilized to take advantage of its attributes (relating to solubility and activity).

The mutant proteins of the present invention exhibit improved biological activity as well as extended half-life. Further, increased activity observed with the truncated proteins of the present invention may be due to elimination of basic residues and/or the lowering of the protein's isoelectric point. Biological activity and improved refolding can be enhanced when the modified proteins of the present invention are combined with the modifications described in copending application Ser. No. ______ [Atty Docket No. STK-076, filed on ______] and Ser. No. ______ [Atty Docket No. STK-077, filed on ______], the disclosures of which are incorporated herein by reference.

B. N-Terminal Regions with Specific Properties

Additional modified proteins of the invention comprise peptides of non-morphogen origin fused to the N-terminus of a morphogen 7-cysteine domain. See e.g., FIGS. 7A-7E. The resulting N-terminal fusion proteins have additional biological or biochemical properties not present in the unmodified morphogen from which the fusion is derived. Fusions of this type comprise a morphogen 7-cysteine domain fused at its N-terminus to a protein, or protein fragment, such as a collagen binding domain, an FB domain of protein A, or a hexa-histidine region. For example, H2440 is OP-1 with a hexa-his tag attached to its N-terminus as a binding domain for IMAC (immobilized metal affinity chromatography) resin. (FIG. 7B). This protein has been purified over copper IMAC resin, initially in its unfolded state, in the presence of urea. After the purification of the unfolded protein on IMAC, followed by refolding, the successfully refolded fraction is purified by RP-HPLC. Such N-terminal fusion proteins display little or no activity in a ROS assay, but are activated upon cleavage of the N-terminal non-morphogen peptide to yield an active C-terminal morphogen domain.

Particularly preferred are those engineered OP-1 constructs that can target specific sites. For example, an OP-1 with an N-terminal decapeptide collagen binding domain was constructed, H2487, in which the decapeptide was placed 7 residues upstream from the first cysteine (see FIG. 7A) to obtain specific and tight binding of OP-1 to bone matrix. This new construct was successfully refolded and active in the ROS assay, thereby indicating specific bone forming activity. Other binding domains can be used similarly to direct activity. For example, in the context of cartilage repair, OP-1 can also be engineered to specifically adhere to prosthetic devices. Other peptides, such as a peptide derived from Clostridium collagenase, can also be explored for collagen binding properties.

One of ordinary skill in the art will appreciate that the techniques of the present invention can be used to generate specific modified protein formulations that are capable of environmentally-triggered release of active protein at specific sites under particular conditions. For example, changes in pH or presence of a particular protease can modulate delivery and trigger release of active protein.

Modifications of the leader sequence of a BMP or other TGF-β family members can also affect solubility, activity, and expression of the protein. For example, construct H2528, which utilizes CDMP-3 (thought to be useful for tendon repair) engineered with the FB subdomain of Staphylococcus aureus protein A as a leader sequence, has improved expression of the osteogenic protein.

The skilled artisan will appreciate that the constructs of the present invention can be engineered to contain a variety of specialized, functional domains that can be attached to the N-terminus of the TGF-β family protein, provided that steric interference and the consequent reduction in biological activity are taken into account. Such constructs may require at least a minimum spacing of the N-terminal addition from the 7-cysteine domain to avoid inhibition of activity or folding. The skilled artisan will appreciate that minimum spacing requirements will depend upon the steric properties of the added moiety and the ultimate intended activity of the modified construct, so that both the specialized domain and the TGF-β family protein will retain their intended activities.

C. Latent BMPs

The present invention also takes advantage of the surprising discovery of the extent to which the N-terminus can affect the solubility and activity of the fusion proteins, since truncations of the OP-1 N-terminus had no negative effects on the protein. In addition, the crystal structure of OP-1 had not revealed any topological information regarding the N-terminus.

The N-terminal fusion proteins described herein are useful for providing latent (i.e., inactive) forms of a protein that can be cleaved to produce an active protein at a desired time and location. For example, a modified morphogen containing a collagen binding domain (e.g., H2487, shown in FIG. 7A) can be delivered in an inactive form to a desired tissue locus (e.g., a locus containing an implanted collagen matrix) and cleaved at that locus to produce an active morphogen. Cleavage can result from conditions endogenous to the target locus (e.g., naturally-occurring proteases) or can be the result of administration of specific proteases or other factors (e.g., acidification of a locus). In addition, a very specific protease cleavage site may be engineered, e.g., for a protease found in a fracture site, allowing selective, delayed, and/or gradual activation of OP-1 at the site of implant.

D. Domain Swapping

Additional constructs to alter refolding, solubility, activity and expression can be designed by replacing the native leader sequence of one TGF-β superfamily protein with the native leader sequence of another TGF-β family member. For example, the construct H2549 has the N-terminus of BMP-2 transposed onto OP-1.

E. Heterodimers

Although some N-terminal fusion protein monomers as described above do not form active homodimers without cleavage of the leader sequence, active heterodimers are formed between those proteins and unmodified monomers of TGF-β family proteins. Accordingly, such heterodimers can be used to provide proteins to a target site by virtue of the N-terminal non-TGF-β family protein domain attached to the fusion protein, such as a collagen binding domain. Alternatively, design features can be used to enhance purification of heterodimers. Purification can be facilitated by accentuating purification differences between two kinds of subunits, for instance, by adding a hexa-histidine. A mixed refolding would provide a mixture of two homodimers and the heterodimer, which provides three separable species. For example, an N-terminal fusion protein containing a hexa-histidine domain (e.g., H2440, shown in FIG. 7B) which binds an IMAC column, is useful to aid in purification of the fusion protein, which can subsequently be activated by cleavage of the N-terminal domain.

E. coli expression for construction of heterodimers of the present invention is preferred, because the practitioner can adjust the ratio of each monomer for optimal yields of heterodimer. In addition, this method is very rapid. For example, in an in vitro heterodimer formation experiment between the hexa-histidine tagged OP-1, modified with the preferred modifications of charged amino acids, E, D, E, and R, (H2440) (see, for example, Attorney Docket No. ______, the entire disclosure of which is incorporated by reference herein) and BMP-2, the yield of heterodimers was excellent. There is an exceptionally high yield of heterodimer, more than the theoretically expected 50% heterodimer and 25% of each homodimer. This may occur because BMP-2 associates more readily with OP-1 than with itself, or faster than OP-1 reassociates with itself. Alternatively, the BMP-2 may act as a chaperone for folding. Another experiment also showed heterodimer formation between BMP-2 and the H2447 mutant, OP-1 (no hexa-his tag), which also associated readily, generating good yields of heterodimer. Heterodimers were also made between FB-OP-1 (H2521) and BMP-2. Heterodimers of truncated OP-1, H2469 (retaining 15 residues upstream of the first cysteine), and BMP-5 (H2475); and H2469 and CDMP-2 (H2471) have also been constructed.
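The theoretically expected distribution of 50% heterodimer and 25% of each homodimer follows from random pairing of two monomers present in equal amounts. By way of non-limiting illustration, the Python sketch below computes the expected distribution for any monomer ratio. It assumes purely random, equal-affinity association, which is a simplifying assumption introduced here; as noted above, the observed heterodimer yield exceeded this expectation, so the model should be regarded as a baseline only. The function and parameter names are likewise hypothetical.

```python
def expected_dimer_fractions(fraction_a: float) -> tuple:
    """Expected (A/A, A/B, B/B) dimer fractions under random pairing.

    fraction_a is the mole fraction of monomer A in the refolding mixture;
    monomer B makes up the remainder.
    """
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must be between 0 and 1")
    fraction_b = 1.0 - fraction_a
    homodimer_a = fraction_a * fraction_a
    heterodimer = 2.0 * fraction_a * fraction_b
    homodimer_b = fraction_b * fraction_b
    return (homodimer_a, heterodimer, homodimer_b)


if __name__ == "__main__":
    # Equal amounts of each monomer give the 25% / 50% / 25% baseline.
    print(expected_dimer_fractions(0.5))
    # An unequal ratio lowers the baseline heterodimer fraction.
    print(expected_dimer_fractions(0.25))
```

Under this baseline model the heterodimer fraction is greatest when the two monomers are present in equal amounts.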

As well as being efficient in refolding, heterodimers of hexa-his-OP-1 (H2440) and BMP-2 (H2142) have much greater activity in a ROS assay than the homodimers. The hexa-his-OP-1 homodimer had very low activity. The homodimer of BMP-2 had better activity. However, OP-1/BMP-2 heterodimer was far more active than either parent homodimer. In this assay the heterodimer had only about 3-fold less activity than the CHO derived OP-1 standard. The heterodimer of OP-1 without the hexa-his tag, (H2447) with BMP-2 had similar activity. H2447 is a refolding mutant with modifications in finger-2 and had relatively lower activity as a homodimer. Heterodimers of OP-1 (H2469)/BMP-5 (H2475) and OP-1 (H2469)/CDMP-2 (H2471) provided a good result on a ROS assay (2.5-3+).

Using this same protocol and methodology, an OP-1/BMP-2 heterodimer was constructed, expressed in E. coli, and refolded in vitro. Specifically, H2447/BMP-2 heterodimers and H2440/BMP-2 heterodimers were created by E. coli expression and refolded in vitro under physiological conditions. Based on SDS-PAGE analysis, most of the material readily combined to form a heterodimeric species. Additional species are formed using heterodimers comprising a non-morphogen domain. Examples of such species are N-terminal fusions to morphogens, such as a collagen binding domain fused to OP-1 (H2487), hexa-histidine fused to OP-1 (H2440), the FB domain of Protein A fused to OP-1 (H2521), and the FB domain fused to the hexa-histidine/OP-1 construct H2440 (H2525).

Active heterodimers can also be constructed from two BMPs or other TGF-β family proteins that were expressed in different systems. Some constructs are expressed better and are more active when expressed in certain systems over others. One can express each construct in the environment best suited for its expression and then form active heterodimers with them. For example, H2223, a mutant OP-1, is expressed in CHO cells, a mammalian expression system, while H2525 (FIG. 7D), FB-domain OP-1, is best expressed in E. coli, a bacterial expression system.

Further, the activity of the heterodimers can be manipulated by changing the two proteins used. For example, a heterodimer of H2487, OP-1 with a decapeptide collagen binding site, and CDMP3 can be formed. This heterodimer will have an activity different from a H2487 and BMP-2 heterodimer.

F. Choice and Optimization of Constructs

As taught herein, the present invention provides the skilled artisan with the know-how to craft customized chimeric proteins and DNAs encoding the same. Further taught and exemplified herein are the means to design chimeric proteins having certain desired attribute(s) making them suitable for specific in vivo applications (see at least Sections I.B., II., and III. Examples 1-4, 8 and 11 for exemplary embodiments of the foregoing chimeric proteins). For example, chimeric proteins having altered solubility attributes can be used in vivo to manipulate morphogenic effective amounts provided to a recipient. That is, increased solubility can result in increased availability; diminished solubility can result in decreased availability. Thus, such systemically administered chimeric proteins can be immediately available/have immediate morphogenic effects, whereas locally administered chimeric proteins can be available more slowly/have prolonged morphogenic effects. The skilled artisan will appreciate when increased versus diminished solubility attributes are preferred given the facts and circumstances at hand. Optimization of such parameters requires routine experimentation and ordinary skill.

Similarly, chimeric proteins having altered stability attributes can be used in vivo to manipulate morphogenic effective amounts provided to a recipient. That is, increased stability can result in increased half-life because turnover in vivo is less; diminished stability can result in decreased half-life and availability because turnover in vivo is more. Thus, such systemically administered chimeric proteins can either be immediately available/have immediate morphogenic effects achieving a bolus-type dosage or can be available in vivo for prolonged periods/have prolonged morphogenic effects achieving a sustained release type dosage. The skilled artisan will appreciate when increased versus diminished stability attributes are preferred given the facts and circumstances at hand. Optimization of such parameters requires routine experimentation and ordinary skill.

In addition, those protein constructs with altered stability can also be used to practical advantage for improving shelf-life, storage and/or shipping considerations. Furthermore, on a related matter, altered stability can also directly affect dosage considerations thereby, for example, reducing the cost of treatment.

Additionally, chimeric proteins having a combination of altered attributes, such as but not limited to solubility and stability attributes, can be used in vivo to manipulate morphogenic effective amounts provided to a recipient. That is, by designing a chimeric protein with a combination of specific altered attributes, morphogenic effective amounts can be administered in a timed-release fashion; dosages can be regulated both in terms of amount and duration; treatment regimens can be initiated at low doses systemically or locally followed by a transition to high doses, or vice versa; to name but a few paradigms. The skilled artisan will appreciate when low versus high morphogenic effective amounts are suitable under the facts and circumstances at hand. Optimization of such parameters requires routine experimentation and ordinary skill.

Furthermore, chimeric proteins having one or more altered attributes are useful to overcome inherent deficiencies in development. Chimeric proteins having one or more altered attributes can be designed to circumvent an inherent defect in a host's native morphogenic signaling system. As a non-limiting example, a chimeric protein of the present invention can be used to bypass a defect in a native receptor in a target tissue, a defect in an intracellular signaling pathway, and/or a defect in other events which are reliant on the attributes of a subdomain(s) associated with recognition of a moiety per se as opposed to the attributes associated with function/biological activity which are embodied in a different subdomain(s). The skilled artisan will appreciate when such chimeric proteins are suitable given the facts and circumstances at hand. Optimization requires routine experimentation and ordinary skill.

Practice of the invention will be still more fully understood from the following examples, which are presented herein for illustration only and should not be construed as limiting the invention in any way.

EXAMPLE 1 Synthesis of a BMP Mutant

FIG. 8 shows the nucleotide and corresponding amino acid sequence for the OP-1 C-terminal seven cysteine domain. Knowing these sequences permits identification of useful restriction sites for engineering in mutations by, for example, cassette mutagenesis, the well-known method of Kunkel (mutagenesis by primer extension using M13-derived single-stranded templates), or the well-known PCR methods, including overlap extension. An exemplary mutant of OP-1 is H2460, with 4 amino acid changes in the finger 2 sub-domain and an amino acid change in the last C-terminal amino acid, constructed as described below. It is understood by the skilled artisan that the mutagenesis protocol described is exemplary only, and that other means for creating the constructs of the invention are well-known and well described in the art.

Four amino acid changes were introduced into the OP-1 finger 2 sub-domain sequence by means of standard polymerase chain reactions using the overlap extension technique, resulting in OP-1 mutant H2460. The four changes in the finger 2 region were N6>S, R25>E, N26>D and R30>E. This mutant also contained a further change, H35>R, of the C-terminal residue. The template for these reactions was the mature domain of a wild type OP-1 cDNA clone, which had been inserted into an E. coli expression vector engineered with an ATG start codon at the beginning of the mature region. The ATG had been introduced by PCR using as a forward primer a synthetic oligonucleotide of the following sequence: ATG TCC ACG GGG AGC AAA CAG (SEQ ID NO: 36), encoding M S T G S K Q (SEQ ID NO: 37). The PCR reaction was done in combination with an appropriate back-primer complementary to the 3′ coding region of the cDNA.

In order to construct the finger 2 mutant H2460, a PCR fragment encoding the modified finger 2 was made in a standard PCR reaction, using a commercially available PCR kit according to the manufacturer's instructions, with synthetic oligonucleotides as primers.

To obtain the N6>S change, a forward primer (primer #1) of the sequence GCG CCC ACG CAG CTC AGC GCT ATC TCC GTC CTC (SEQ ID NO: 70) was used, encoding the amino acid sequence: A P T Q L S A I S V L (SEQ ID NO: 71).

For the changes near the C-terminus, a back-primer, 43 nucleotides long, (primer #2) was used which introduced the R25>E, N26>D, R30>E and C-terminal H35>R changes. This primer #2 had the sequence: CTA TCT GCA GCC ACA AGC TTC GAC CAC CAT GTC TTC GTA TTT C (SEQ ID NO: 72), which is the complement of the coding sequence G AAA TAC GAA GAC ATG GTG GTC GAA GCT TGT GGC TGC AGA TAG (SEQ ID NO: 73), encoding the amino acids: K Y E D M V V E A C G C R stop (SEQ ID NO: 74).
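The relationship between primer #2, its coding-strand complement and the encoded peptide can be checked mechanically. The following Python sketch is offered for illustration only and is not part of the described protocol; the function names are hypothetical, and the residue numbering assumes, as stated above, that the residue immediately preceding the stop codon is position 35.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper case A/C/G/T)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons in frame 0; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Primer #2 (SEQ ID NO: 72) as given in the text, spaces removed.
PRIMER_2 = "CTATCTGCAGCCACAAGCTTCGACCACCATGTCTTCGTATTTC"

coding = reverse_complement(PRIMER_2)
# The leading G belongs to the preceding codon, so translation starts at index 1.
peptide = translate(coding[1:])

# Assumption: the last residue before the stop codon is residue 35.
residues = peptide.rstrip("*")
first_position = 35 - len(residues) + 1
numbered = {first_position + i: aa for i, aa in enumerate(residues)}

print(len(PRIMER_2), coding, peptide)
print({pos: numbered[pos] for pos in (25, 26, 30, 35)})
```

Under that numbering assumption, the translated peptide carries E, D, E and R at positions 25, 26, 30 and 35, consistent with the R25>E, N26>D, R30>E and H35>R changes described.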

The fragment with finger 2 and C-terminus mutations was then combined with another PCR fragment encoding the upstream part of mature OP-1, with N-terminus, finger 1 and heel sub-domains. The latter PCR fragment, encoding the N-terminus, finger 1 and heel sub-domains, was constructed again using an OP-1 expression vector for E. coli as template. The vector contained an OP-1 cDNA fragment, encoding the mature OP-1 protein, attached to a T7 promoter and ribosome binding site for expression under control of either a T7 promoter in an appropriate host or a trp promoter. In this T7 expression vector, pET-3d (Novagen Inc., Madison Wis.), the sequence between the T7 promoter, at the XbaI site, and the ATG codon of mature OP-1 is as follows: TCTAGAATAATTTTGTTTAACCTTTAAGAAGGAGATATACG ATG (SEQ ID NO: 75).

This second PCR reaction was primed with a forward primer (primer #3) TAA TAC GAC TCA CTA TAG G (SEQ ID NO: 76), which primes in the T7 promoter region, and a back-primer (primer #4) that overlaps with primer #1 and has the nucleotide sequence GCT GAG CTG CGT GGG CGC (SEQ ID NO: 77), which is the complement of the coding sequence GCG CCC ACG CAG CTC AGC (SEQ ID NO: 78), encoding A P T Q L S (SEQ ID NO: 79).

In a third PCR reaction, the actual overlap extension reaction, portions of the above two PCR fragments were combined and amplified by PCR, resulting in a single fragment containing the complete mature OP-1 region. For this reaction, primer #3 was used as forward primer and a new primer (primer #5) was used as a back-primer with the following sequence: GG ATC CTA TCT GCA GCC ACA AGC (SEQ ID NO: 80), which is the complement to the coding sequence GCT TGT GGC TGC AGA TAG GAT CC (SEQ ID NO: 81), encoding A C G C R stop (SEQ ID NO: 82). This primer also adds a convenient 3′ BamHI site for inserting the gene into the expression vector.
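The overlap extension step relies on the two first-round fragments sharing a common sequence, and on primer #5 annealing to the end created by primer #2. A minimal Python sketch of these checks follows; it is illustrative only, the helper names are hypothetical, and the BamHI recognition sequence GGATCC is the standard site rather than something recited above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper case A/C/G/T)."""
    return seq.translate(COMPLEMENT)[::-1]


def shared_prefix_length(a, b):
    """Number of leading positions at which two sequences are identical."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Primer sequences as given in the text, spaces removed.
PRIMER_1 = "GCGCCCACGCAGCTCAGCGCTATCTCCGTCCTC"            # forward, N6>S
PRIMER_2 = "CTATCTGCAGCCACAAGCTTCGACCACCATGTCTTCGTATTTC"  # back, C-terminal changes
PRIMER_4 = "GCTGAGCTGCGTGGGCGC"                            # back, overlaps primer #1
PRIMER_5 = "GGATCCTATCTGCAGCCACAAGC"                       # back, adds BamHI site
BAMHI_SITE = "GGATCC"  # standard BamHI recognition sequence

# Primer #4 is the reverse complement of the 5' end of primer #1, so the two
# first-round fragments share this stretch and can prime each other.
overlap_1_4 = shared_prefix_length(PRIMER_1, reverse_complement(PRIMER_4))

# Primer #5 begins with the BamHI site; the site shares one C with the
# complement of the stop codon, so the annealing portion starts at index 5.
has_bamhi = PRIMER_5.startswith(BAMHI_SITE)
anneal_5_2 = shared_prefix_length(PRIMER_5[5:], PRIMER_2)

print(overlap_1_4, has_bamhi, anneal_5_2)
```

With the sequences as written, primer #4 matches the first 18 nucleotides of primer #1, and primer #5 matches the first 18 nucleotides of primer #2 downstream of the added restriction site.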

The fragment bearing the complete mutant gene, resulting from the overlap extension PCR, was cloned into a commercial cloning vector designed for cloning of PCR fragments, such as pCR2.1-topo-TA (Invitrogen Inc., Carlsbad Calif.). The cloned PCR fragment was recovered by restriction digest with XbaI and BamHI and inserted into the XbaI and BamHI sites of a commercially available T7 expression vector such as pET-3d (Novagen Inc., Madison Wis.).

EXAMPLE 2 E. coli Expression of a BMP

Transformed cells were grown in standard SPYE 2YT media, 1:1 ratio, (see, Sambrook et al., for example) at 37° C., under standard culturing conditions. Heterologous protein overexpression typically produced inclusion bodies within 8-48 hours. Inclusion bodies were isolated and solubilized as follows. One liter of culture fluid was centrifuged to collect the cells. The cells in the resulting pellet then were resuspended in 60 ml 25 mM Tris, 10 mM EDTA, pH 8.0 (TE Buffer) + 100 μg/ml lysozyme and incubated at 37° C. for 2 hours. The cell suspension was then chilled on ice and sonicated to lyse the cells. Cell lysis was ascertained by microscopic examination. The volume of the lysate was adjusted to approximately 300 ml with TE Buffer, then centrifuged to obtain an inclusion body pellet. The pellet was washed by 2-4 successive resuspensions in TE Buffer and centrifugation. The washed inclusion body pellet was solubilized by denaturation and reduction in 40 ml 100 mM Tris, 10 mM EDTA, 6 M GuHCl (guanidinium hydrochloride), 250 mM DTT, pH 8.8. Proteins then were pre-purified using a standard, commercially available C2 or C8 cartridge (SPICE cartridges, 400 mg, Analtech, Inc.). Protein solutions were acidified with 2% TFA (trifluoroacetic acid), applied to the cartridge, washed with 0.1% TFA/10% acetonitrile, and eluted with 0.1% TFA/70% acetonitrile. The eluted material then was dried down or diluted and fractionated by C4 RP-HPLC.

EXAMPLE 3 Refolding of a BMP Dimer

Proteins prepared as described above were dried down prior to refolding, or diluted directly into refolding buffer. The preferred refolding buffer used was: 100 mM Tris, 10 mM EDTA, 1 M NaCl, 2% CHAPS, 5 mM GSH (reduced glutathione), 2.5 mM GSSG (oxidized glutathione), pH 8.5. Refoldings (12.5-200 μg protein/ml) were carried out at 4° C. for 24-90 hours, typically 36-48 hours, although longer refolding times (up to weeks) are expected to provide good refolding in some mutants, followed by dialysis against 0.1% TFA, then 0.01% TFA, 50% ethanol. Aliquots of the dialyzed material then were dried down in preparation for the various assays.
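For bench planning, the amounts needed to prepare the preferred refolding buffer can be computed from the stated concentrations. The Python sketch below is a hedged illustration, not part of the described method: the reagent forms (Tris base, EDTA disodium dihydrate) and the reading of 2% CHAPS as weight per volume are assumptions, the molecular weights are standard catalogue values, and pH adjustment to 8.5 is not modeled.

```python
# name: (molar concentration in mol/L, molecular weight in g/mol)
# Reagent forms are assumptions; the text gives only "Tris" and "EDTA".
MOLAR_COMPONENTS = {
    "Tris base": (0.100, 121.14),
    "EDTA disodium dihydrate": (0.010, 372.24),
    "NaCl": (1.0, 58.44),
    "GSH": (0.005, 307.32),
    "GSSG": (0.0025, 612.63),
}
CHAPS_PERCENT_WV = 2.0  # assumption: 2% means 2 g per 100 ml


def refolding_buffer_masses(volume_l):
    """Grams of each component for the given buffer volume in liters."""
    masses = {
        name: molar * mw * volume_l
        for name, (molar, mw) in MOLAR_COMPONENTS.items()
    }
    # 1% w/v corresponds to 10 g per liter.
    masses["CHAPS"] = CHAPS_PERCENT_WV * 10.0 * volume_l
    return masses


def refolding_volume_ml(protein_ug, target_ug_per_ml):
    """Buffer volume (ml) that brings the protein to the target concentration."""
    if not 12.5 <= target_ug_per_ml <= 200:
        raise ValueError("target lies outside the 12.5-200 ug/ml range in the text")
    return protein_ug / target_ug_per_ml


for name, grams in refolding_buffer_masses(0.1).items():
    print(f"{name}: {grams:.3f} g per 100 ml")
print(refolding_volume_ml(1000, 50), "ml for 1000 ug at 50 ug/ml")
```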

EXAMPLE 4 Purification and Testing of a Refolded BMP Dimer

4A. SDS-PAGE, HPLC: Samples were dried down and resuspended in Laemmli gel sample buffer and then electrophoresed in a 15% SDS-polyacrylamide gel. All assays included molecular weight standards and/or purified mammalian cell produced OP-1 for comparison. Analysis of OP-1 dimers was performed in the absence of added reducing agents, while OP-1 monomers were produced by the addition of 100 mM DTT to the gel samples. Folded dimer has an apparent molecular weight in the range of about 30-36 kDa, while monomeric species have an apparent molecular weight of about 14-16 kDa.

Alternatively, samples were chromatographed on a commercially available RP-HPLC, as follows. Samples were dried down and resuspended in 0.1% TFA/30% acetonitrile. The protein then was applied to a C18 column in 0.1% TFA, 30% acetonitrile and fractionated using a 30-60% acetonitrile gradient in TFA. Properly folded dimers elute as a discrete peak at 45-50% acetonitrile; monomers elute at 50-60% acetonitrile.

4B. Hydroxyapatite Chromatography: Samples were loaded onto hydroxyapatite in 10 mM phosphate, 6 M urea, pH 7.0 (Column Buffer). Unbound material was removed by washing with Column Buffer, followed by elution of monomer with Column Buffer + 100 mM NaCl. Dimers were eluted with Column Buffer + 250 mM NaCl.

4C. Trypsin Digest: Tryptic digests were performed in a digestion buffer of 50 mM Tris, 4 M urea, 100 mM NaCl, 0.3% Tween 80, 20 mM methylamine, pH 8.0. The ratio of enzyme to substrate was 1:50 (weight to weight). After incubation at 37° C. for 16 hours, 15 μl of digestion mixture was combined with 5 μl 4× gel sample buffer without DTT and analyzed by SDS-PAGE. Purified mammalian OP-1 and undigested BMP dimer were included for comparison. Under these conditions, properly folded dimers are cleaved to produce a species with slightly faster migration than uncleaved standards, while monomers and mis-folded dimers are completely digested and do not appear as bands in the stained gel.

EXAMPLE 5 In Vitro Cell-Based Bioassay of Osteogenic Activity

This example demonstrates the bioactivity of morphogen constructs which have acquired osteogenic or bone-forming capabilities in accordance with the present invention. Osteogenic proteins having either an innate ability or an acquired ability for high specific bone forming activity can induce alkaline phosphatase activity in rat osteoblasts, including rat osteosarcoma cells and rat calvaria cells. In the assay, rat osteosarcoma or calvaria cells were plated onto a multi-well plate (e.g., a 48 well plate) at a concentration of 50,000 osteoblasts per well, in αMEM (modified Eagle's medium, Gibco, Inc. Long Island) containing 10% FBS (fetal bovine serum), L-glutamine and penicillin/streptomycin. The cells were incubated for 24 hours at 37° C., at which time the growth medium was replaced with αMEM containing 1% FBS and the cells incubated for an additional 24 hours so that cells were in serum-deprived growth medium at the time of the experiment.

Cultured cells then were divided into three groups: (1) wells receiving various concentrations of biosynthetic osteogenic protein; (2) a positive control, such as mammalian expressed hOP-1; and (3) a negative control (no protein or TGF-β). The protein concentrations tested were in the range of 50-500 ng/ml. Cells were incubated for 72 hours. After the incubation period the cell layer was extracted with 0.5 ml of 1% Triton X-100. The resultant cell extract was centrifuged, 100 μl of the extract was added to 90 μl of PNPP (para-nitrophenylphosphate)/glycerine mixture and incubated for 30 minutes in a 37° C. water bath, and the reaction was stopped with 100 μl 0.2 N NaOH. The samples then were run through a plate reader (e.g., Dynatech MR700) and absorbance measured at 400 nm, using p-nitrophenol as a standard, to determine the presence and amount of alkaline phosphatase activity. Protein concentrations were determined by standard means, e.g., the Biorad method, UV scan or HPLC area at 214 nm. Alkaline phosphatase activity was calculated in units/μg protein, where 1 unit equals 1 nmol p-nitrophenol liberated/30 minutes at 37° C.
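The unit calculation described above can be sketched in Python as follows. This is an illustrative sketch only: the standard-curve readings, sample absorbance and protein amount are hypothetical values chosen for the example, the function names are not taken from the text, and a linear relationship between absorbance at 400 nm and p-nitrophenol is assumed over the range of the standards.

```python
def fit_line(points):
    """Ordinary least-squares slope and intercept for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def alkaline_phosphatase_units_per_ug(a400, standards, protein_ug, minutes=30.0):
    """Units per ug protein; 1 unit = 1 nmol p-nitrophenol liberated per 30 min."""
    slope, intercept = fit_line(standards)
    pnp_nmol = (a400 - intercept) / slope   # invert the standard curve
    units = pnp_nmol * (30.0 / minutes)     # normalize to a 30 minute incubation
    return units / protein_ug


# Hypothetical p-nitrophenol standards: (nmol per well, absorbance at 400 nm).
STANDARDS = [(0.0, 0.0), (10.0, 0.2), (20.0, 0.4), (40.0, 0.8)]

activity = alkaline_phosphatase_units_per_ug(
    a400=0.5, standards=STANDARDS, protein_ug=20.0
)
print(f"{activity:.2f} units/ug protein")
```

In this hypothetical example an absorbance of 0.5 corresponds to 25 nmol p-nitrophenol, which for 20 μg of protein and a 30 minute incubation gives 1.25 units/μg.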

hOP-1 and BMP-2 generate approximately 1.0-1.4 units at between 100-200 ng/ml. Other results are provided in Table 1 for the various protein constructs.

EXAMPLE 6 In Vitro Cell-Based Bioassay of CDMP Activity

This example demonstrates the bioactivity of constructs which have acquired enhanced tissue morphogenic capabilities in accordance with the present invention. Native CDMPs fail to induce alkaline phosphatase activity in rat osteosarcoma cells as used in Example 5, but they do induce alkaline phosphatase activity in the mouse teratocarcinoma cell line ATDC-5, a chondroprogenitor cell line (Atsumi, et al., 1990, Cell Differentiation and Development 30: 109). Folded mutants that are negative in the rat osteosarcoma cell assay but positive in the ATDC-5 assay are described as having acquired CDMP-like activity. In the ATDC-5 assay, cells were plated at a density of 4×10⁴ in serum-free basal medium (BM: Ham's F-12/DMEM [1:1] with ITS™+ culture supplement [Collaborative Biomedical Products, Bedford, Mass.], alpha-ketoglutarate (1×10⁻⁴ M), ceruloplasmin (0.25 U/ml), cholesterol (5 μg/ml), phosphatidylethanolamine (2 μg/ml), alpha-tocopherol acid succinate (9×10⁻⁷ M), reduced glutathione (10 μg/ml), taurine (1.25 μg/ml), triiodothyronine (1.6×10⁻⁹ M), parathyroid hormone (5×10⁻¹⁰ M), β-glycerophosphate (10 mM), and L-ascorbic acid 2-sulphate (50 μg/ml)). CDMP or other biosynthetic osteogenic protein (0-300 ng/ml) was added the next day and the culture medium, including CDMP or biosynthetic osteogenic protein, replaced every other day. Alkaline phosphatase activity was determined in sonicated cell homogenates after 4, 6 and/or 12 days of treatment. After extensive washing with PBS, cell layers were sonicated in 500 μl of PBS containing 0.05% Triton X-100. 50-100 μl aliquots were assayed for enzyme activity in assay buffer (0.1 M sodium barbital buffer, pH 9.3) with p-nitrophenyl phosphate as substrate. Absorbance was measured at 400 nm, and activity normalized to protein content measured by Bradford protein assay (bovine serum albumin standard).

CDMP-1 and CDMP-2 generated approximately 2-3 units of activity at day 10 at 100 ng/ml. OP-1 generated approximately 6-7 units of activity at day 10 at 100 ng/ml.

EXAMPLE 7 In Vitro Cell-Based Bioassay of TGF-β-Like Activity

This example demonstrates the bioactivity of biosynthetic mutant TGF-β proteins having altered biological capabilities in accordance with the invention. TGF-β proteins can inhibit epithelial cell proliferation. Numerous cell inhibition assays are well described in the art. See, for example, Brown et al. (1987) J. Immunol. 139:2977, describing a colorimetric assay using human melanoma A375 fibroblast cells, and described herein below. Another assay uses epithelial cells, e.g., mink lung epithelial cells, and proliferative effects are determined by ³H-thymidine uptake.

Briefly, in the assay the TGF-β biosynthetic construct is serially diluted in a multi-well tissue plate containing RPMI-1640 medium (Gibco) and 5% fetal calf serum. Control wells receive medium only. Melanoma cells then are added to the well (1.5×10⁴). The plates then are incubated at 37° C. for about 72 hours in 5% CO₂, and the cell monolayers washed once, fixed and stained with crystal violet for 15 minutes. Unbound stain is washed out and the stained cells then lysed with 33% acetic acid to release the stain (confined to the cell nuclei), and the OD measured at 590 nm with a standard, commercially available photometer to calculate the activity of the test molecules. The intensity of staining in each well is directly related to the number of nuclei. Accordingly, active TGF-β molecules are expected to stain lighter than inactive compounds or the negative control well.

In another assay, mink lung cells are used. These cells grow and proliferate under standard culturing conditions, but are arrested following exposure to TGF-β, as determined by ³H-thymidine uptake using culture cells from a mink lung epithelial cell line (ATCC No. CCL 64, Rockville, Md.). Briefly, cells are grown to confluency in EMEM, supplemented with 10% FBS, 200 units/ml penicillin, and 200 μg/ml streptomycin. These cells are cultured to a cell density of about 200,000 cells per well. At confluency the media is replaced with 0.5 ml of EMEM containing 1% FBS and penicillin/streptomycin and the culture incubated for 24 hours at 37° C. Candidate proteins then are added to each well and the cells incubated for 18 hours at 37° C. After incubation, 1.0 μCi of ³H-thymidine in 10 μl is added to each well, and the cells incubated for four hours at 37° C. The media then is removed from each well and the cells washed once with ice-cold phosphate buffered saline, and DNA is precipitated by adding 0.5 ml of 10% TCA to each well and incubating at room temperature for 15 minutes. The cells are washed three times with ice-cold distilled water, lysed with 0.5 ml 0.4 M NaOH, and the lysate from each well then transferred to a scintillation vial and the radioactivity recorded using a scintillation counter (Smith-Kline Beckman). Biologically active molecules will inhibit cell proliferation, resulting in less thymidine uptake and fewer counts as compared to inactive proteins and/or the negative control well (no added growth factor).

EXAMPLE 8 In Vivo Bioassay of Osteogenic Activity: Endochondral Bone Formation and Related Properties

The art-recognized bioassay for bone induction as described by Sampath and Reddi (Proc. Natl. Acad. Sci. USA (1983) 80:6591-6595) and U.S. Pat. Nos. 4,968,590, 5,266,683, the disclosures of which are herein incorporated by reference, can be used to establish the efficacy of a given protein, device or formulation. Briefly, the assay consists of depositing test samples in subcutaneous sites in recipient rats under ether anesthesia. A vertical incision (1 cm) is made under sterile conditions in the skin over the thoracic region, and a pocket is prepared by blunt dissection. In certain cases, the desired amount of osteogenic protein (10 ng-10 μg) is mixed with approximately 25 mg of matrix material, prepared using standard procedures such as lyophilization, and the test sample is implanted deep into the pocket and the incision is closed with a metallic skin clip. The heterotopic site allows for the study of bone induction without the possible ambiguities resulting from the use of orthotopic sites. The implants also can be provided intramuscularly, which places the devices in closer contact with accessible progenitor cells. Typically intramuscular implants are made in the skeletal muscle of both legs.

The sequential cellular reactions occurring at the heterotopic site are complex. The multistep cascade of endochondral bone formation includes: binding of fibrin and fibronectin to implanted matrix, chemotaxis of cells, proliferation of fibroblasts, differentiation into chondroblasts, cartilage formation, vascular invasion, bone formation, remodeling, and bone marrow differentiation.

Successful implants exhibit a controlled progression through the stages of protein-induced endochondral bone development including: (1) transient infiltration by polymorphonuclear leukocytes on day one; (2) mesenchymal cell migration and proliferation on days two and three; (3) chondrocyte appearance on days five and six; (4) cartilage matrix formation on day seven; (5) cartilage calcification on day eight; (6) vascular invasion, appearance of osteoblasts, and formation of new bone on days nine and ten; (7) appearance of osteoblastic and bone remodeling on days twelve to eighteen; and (8) hematopoietic bone marrow differentiation in the ossicle on day twenty-one.

Histological sectioning and staining are preferred to determine the extent of osteogenesis in the implants. Staining with toluidine blue or hematoxylin/eosin clearly demonstrates the ultimate development of endochondral bone. Twelve day bioassays are sufficient to determine whether bone inducing activity is associated with the test sample.

Additionally, alkaline phosphatase activity and/or total calcium content can be used as biochemical markers for osteogenesis. The alkaline phosphatase enzyme activity can be determined spectrophotometrically after homogenization of the excised test material. The activity peaks at 9-10 days in vivo and thereafter slowly declines. Samples showing no bone development by histology should have no alkaline phosphatase activity under these assay conditions. The assay is useful for quantitation and obtaining an estimate of bone formation very quickly after the test samples are removed from the rat. The results as measured by alkaline phosphatase activity level and histological evaluation can be represented as “bone forming units”. One bone forming unit represents the amount of protein that is needed for half maximal bone forming activity on day 12. Additionally, dose curves can be constructed for bone inducing activity in vivo at each step of a purification scheme by assaying various concentrations of protein. Accordingly, the skilled artisan can construct representative dose curves using only routine experimentation.
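One way to estimate a bone forming unit from a dose curve is to interpolate the dose giving half of the maximal observed response. The Python sketch below is illustrative only: the dose-response values are hypothetical, the function names are not taken from the text, and interpolation on a logarithmic dose axis is an assumption chosen because the doses span several orders of magnitude.

```python
import math


def half_maximal_dose(doses, responses):
    """Dose at which the response first reaches half of its observed maximum.

    Interpolates linearly in response and logarithmically in dose between
    the two bracketing data points.
    """
    pairs = sorted(zip(doses, responses))
    half = max(r for _, r in pairs) / 2.0
    for (d0, r0), (d1, r1) in zip(pairs, pairs[1:]):
        if r0 <= half <= r1 and r1 != r0:
            frac = (half - r0) / (r1 - r0)
            log_dose = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("response does not cross half of its maximum")


def bone_forming_units(protein_ng, half_max_dose_ng):
    """Number of bone forming units contained in a given amount of protein."""
    return protein_ng / half_max_dose_ng


# Hypothetical day-12 dose curve: protein per implant (ng) versus
# bone forming activity expressed as a fraction of the maximal score.
DOSES_NG = [10, 50, 100, 500, 1000]
RESPONSES = [0.1, 0.3, 0.7, 0.9, 1.0]

one_unit_ng = half_maximal_dose(DOSES_NG, RESPONSES)
print(f"1 bone forming unit is about {one_unit_ng:.1f} ng in this example")
print(f"1000 ng would contain about {bone_forming_units(1000, one_unit_ng):.1f} units")
```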

Total calcium content can be determined after homogenization in, for example, cold 0.15 M NaCl, 3 mM NaHCO₃, pH 9.0, and measuring the calcium content of the acid soluble fraction of sediment.

EXAMPLE 9 Activity of “Domain Swapping” Mutant

Domain swapping occurs, for example, when one takes the N-terminal region of one type of TGF-β family member protein and attaches it to the seven cysteine domain of another type of TGF-β family member protein. A mutant construct was created by splicing the sequence of the BMP-2 terminus onto the seven cysteine active domain of OP-1 using routine techniques generally known to those of ordinary skill in the art. The resulting mutant, H2549, has an N-terminal region consisting of MQAKHKQRKRLKSS-C. The last amino acid, cysteine, is the first cysteine of the seven cysteine active domain of OP-1. A ROS assay, as described above in Example 5, was used to test activity of H2549.

As illustrated in FIG. 11, the results show that H2549 has very low activity as compared to the level of activity of OP-1. However, upon trypsin cleavage of H2549, using a method similar to the trypsin cleavage of dimers described in Example 4, ROS activity is significantly increased. In this manner, the activity of TGF-β family member proteins can be selectively controlled by attaching non-native N-terminal sequences to inactivate them and cleaving the non-native sequences to activate them.
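The N-terminal region of H2549 is rich in basic residues, which is consistent with its susceptibility to trypsin. The Python sketch below lists the potential cleavage sites in that region using the commonly cited simplified rule that trypsin cleaves on the C-terminal side of lysine or arginine unless the next residue is proline. It is a sequence-only illustration with hypothetical function names; it does not predict which sites are accessible in the folded dimer, and the text does not state which site or sites are actually cleaved.

```python
def trypsin_sites(seq):
    """1-based positions after which trypsin may cleave (simplified K/R rule)."""
    return [
        i + 1
        for i, aa in enumerate(seq[:-1])       # no cleavage after the last residue
        if aa in "KR" and seq[i + 1] != "P"    # proline at P1' blocks cleavage
    ]


def digest(seq):
    """Fragments expected from complete cleavage at every potential site."""
    cuts = [0] + trypsin_sites(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:])]


# N-terminal region of H2549 as given in the text; the final C is the first
# cysteine of the OP-1 seven cysteine domain.
H2549_N_TERMINUS = "MQAKHKQRKRLKSSC"

print(trypsin_sites(H2549_N_TERMINUS))
print(digest(H2549_N_TERMINUS))
```

Under this simplified rule, the region contains six potential sites, the last of which lies three residues upstream of the first cysteine.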

EXAMPLE 10 N-Terminal Truncations Increase Activity

Truncations at the N-terminal regions of modified morphogen proteins, for example by trypsin cleavage, increase ROS activity. Construct H2223 is a modified OP-1 mutant expressed in CHO cells. Two HPLC fractions of H2223 were collected, fractions 13 and 14. An amount of each fraction was truncated by trypsin cleavage, in a manner similar to that used upon dimers in Example 4. The four resulting samples, i.e., fractions 13 and 14 untreated with trypsin and fractions 13 and 14 treated with trypsin, were then subjected to a ROS assay, as described in Example 5 above, using OP-1 activity as the standard.

As illustrated in FIG. 12, the activity levels of fraction 14 treated and untreated with trypsin are approximately the same. This is explained by fraction 14 being composed of partially truncated H2223; thus, further truncation with trypsin does not alter activity. In contrast, untreated fraction 13 is composed mainly of full length H2223 (i.e., the entire N-terminus of 39 amino acids), and truncation of the N-terminus of fraction 13 does increase ROS activity to levels comparable to those of fraction 14. These activity levels are well above the ROS activity level of the OP-1 standard, and demonstrate the improvements in activity obtained with the modified proteins of the present invention.

EXAMPLE 11 Heterodimer Activity

Activity levels of heterodimers are higher than those of the homodimers formed from each of the respective subunits of the heterodimer. Construct H2440, OP-1 with a hexa-his N-terminus, and H2142, BMP-2, were allowed to form heterodimers and homodimers using the method as described in Example 3 above. Heterodimers of H2440/2142, and homodimers of H2440/2440 and H2142/2142, were then subjected to a ROS assay, as described in Examples 4 and 5 above.

As shown in FIGS. 13A and 13B, the homodimers of H2440, OP-1 with a hexa-his at the N-terminus, have very low activity. The homodimers of H2142, BMP-2, have better activity, but activity is still relatively low. However, the heterodimers, OP-1 hexa-his and BMP-2, have far greater activity than either of the homodimers. The heterodimers have only 3-fold less activity than the CHO derived OP-1.

In a similar experiment, homodimers and heterodimers were created between H2525, OP-1 with FB leader sequence, and H2142, BMP-2. These were also subjected to a ROS assay with the level of OP-1 activity as the standard. As illustrated in FIG. 14, homodimers of H2525, OP-1 with FB, have virtually no activity and homodimers of H2142, BMP-2, have very low activity. In contrast, heterodimers of the two, H2525/2142, have unexpectedly high activity levels.

1. A biologically active TGF-β family member fusion protein competent to refold under suitable refolding conditions, comprising: a TGF-β family protein C-terminal seven cysteine domain, comprising a finger 1 subdomain, a finger 2 subdomain, and a heel subdomain; and a heterologous leader sequence domain operatively linked to said C-terminal domain.
2. The fusion protein of claim 1 wherein said leader sequence is selected from the group consisting of a tissue-targeting domain, a molecular-targeting domain, a metal-binding domain, a protein-binding domain, a ceramic-binding domain, a hydroxyapatite-binding domain, and a collagen-binding domain.
3. The fusion protein of claim 2 wherein said tissue-targeting domain binds to a bone matrix protein.
4. The fusion protein of claim 2 wherein said tissue-targeting domain binds to a cell surface molecule.
5. The fusion protein of claim 4 wherein said cell surface molecule is on an osteoprogenitor cell or a chondrocyte.
6. A latent TGF-β family member fusion protein competent to refold under suitable refolding conditions, comprising: a TGF-β family protein C-terminal seven cysteine domain, comprising a finger 1 subdomain, a finger 2 subdomain, and a heel subdomain; and a cleavable leader sequence operably linked to said C-terminal domain wherein said leader sequence inhibits the biological activity associated with said C-terminal domain, and wherein said C-terminal domain becomes active upon cleavage of a part or all of said leader sequence.
7. The fusion protein of claim 6 wherein a tissue-targeting domain is embedded within said cleavable leader sequence, whereby cleavage of the leader sequence will not cleave said tissue-targeting domain from said C-terminal domain.
8. The fusion protein of claim 1 or 6 wherein said leader sequence is separated from said C-terminal domain by at least seven residues.
9. The fusion protein of claim 1 wherein said leader sequence is derived from another TGF-β family protein.
10. A biologically active TGF-β family member protein mutant competent to refold under suitable refolding conditions, comprising: a TGF-β family member protein C-terminal seven cysteine domain, comprising a finger 1 subdomain, a finger 2 subdomain, and a heel subdomain; and a leader sequence domain operatively linked to said C-terminal domain, whereby a part or all of said leader sequence is truncated.
11. The protein mutant of claim 10 wherein said truncation is carried out by protease cleavage.
12. The protein mutant of claim 11 wherein said protease is trypsin.
13. The protein mutant of claim 10 wherein said truncation is carried out by chemical cleavage.
14. The protein mutant of claim 13 wherein said chemical cleavage is acid cleavage.
15. The protein mutant of claim 10 wherein at least one basic residue of said leader sequence is removed.
16. The protein mutant of claim 10 wherein said protein mutant consists essentially of amino acid sequence SEQ ID NO. 69.
17. A biologically active heterodimer of TGF-β family member proteins, comprising: a first subunit being a TGF-β family member fusion protein; and a second subunit selected from the group consisting of a TGF-β family member fusion protein different from that of the first subunit and a wild type TGF-β family protein.
18. The heterodimer of claim 16, wherein said wild type TGF-β family protein is selected from the group consisting of TGF-β1, TGF-β2, TGF-β3, TGF-β4, TGF-β5, dpp, Vg-1, Vgr-1, 60A, BMP-2A, BMP-3, BMP-4, BMP-5, BMP-6, Dorsalin, OP-1, OP-2, OP-3, GDF-1, GDF-3, GDF-9, Inhibin α, Inhibin βA and Inhibin βB.
19. A method of purifying a heterodimer of TGF-β family proteins, said method comprising: (a) providing a first TGF-β family protein subunit; (b) providing a second TGF-β family protein subunit different from said first subunit; (c) mixing said first subunit and said second subunit under suitable refolding conditions to generate a mixture comprising (i) a first homodimer comprising two of said first TGF-β family protein subunits; (ii) a second homodimer comprising two of said second TGF-β family protein subunits; and (iii) a heterodimer comprising one of said first TGF-β family subunits and one of said second TGF-β family subunits; wherein said heterodimer is separable from said first homodimer and said second homodimer; and (d) separating said heterodimer from said first homodimer and said second homodimer.